Abstract

Inverse problems are often ill-posed, with solutions that depend sensitively on
data. In any numerical approach to the solution of such problems, regularization
of some form is needed to counteract the resulting instability. This paper is
based on an approach to regularization, employing a Bayesian formulation of
the problem, which leads to a notion of well-posedness for inverse problems, at
the level of probability measures.

The stability which results from this well-posedness may be used as the basis
for quantifying the approximation, in finite dimensional spaces, of inverse
problems for functions. This paper contains a theory which utilizes the stability to
estimate the distance between the true and approximate posterior distributions,
in the Hellinger metric, in terms of error estimates for approximation of the
underlying forward problem. This is potentially useful as it allows for the transfer
of estimates from the numerical analysis of forward problems into estimates for
the solution of the related inverse problem. In particular, controlling differences
in the Hellinger metric leads to control on the differences between expected
values of polynomially bounded functions and operators, including the mean and
covariance operator.

The ideas are illustrated with the classical inverse problem for the heat equa- 
tion, and then applied to some more complicated non-Gaussian inverse prob- 
lems arising in data assimilation, involving determination of the initial condi- 
tion for the Stokes or Navier-Stokes equation from Lagrangian and Eulerian 
observations respectively.

1 Introduction

In applications it is frequently of interest to solve inverse problems [15, 26]: to find
$u$, an input to a mathematical model, given $y$, an observation of (some components
of, or functions of) the solution of the model. We have an equation of the form

$$y = \mathcal{G}(u) \tag{1.1}$$

to solve for $u \in X$, given $y \in Y$, where $X$, $Y$ are Banach spaces. We refer to
evaluating $\mathcal{G}$ as solving the forward problem.^1 We refer to $y$ as data or observations.
It is typical of inverse problems that they are ill-posed: there may be no solution,
or the solution may not be unique and may depend sensitively on $y$. For this
reason some form of regularization is often employed [7] to stabilize computational
approximations.

We adopt a Bayesian approach to regularization [2] which leads to the notion
of finding a probability measure $\mu$ on $X$, containing information about the relative
probability of different states $u$, given the data $y$. Adopting this approach is natural
in situations where an analysis of the source of data reveals that the observations $y$
are subject to noise. A more appropriate model equation is then often of the form

$$y = \mathcal{G}(u) + \eta \tag{1.2}$$

where $\eta$ is a mean-zero random variable, whose statistical properties we might
know, or make a reasonable mathematical model for, but whose actual value is
unknown to us; we refer to $\eta$ as the observational noise. We assume that it is
possible to describe our prior knowledge about $u$, before acquiring data, in terms
of a prior probability measure $\mu_0$. It is then possible to use Bayes' formula to
calculate the posterior probability measure $\mu$ for $u$ given $y$.

In the infinite dimensional setting the most natural version of Bayes' theorem
is a statement that the posterior measure is absolutely continuous with respect to
the prior [25] and that the Radon-Nikodym derivative (density) between them is
determined by the data likelihood. This gives rise to the formula

$$\frac{d\mu}{d\mu_0}(u) = \frac{1}{Z(y)} \exp\bigl(-\Phi(u; y)\bigr) \tag{1.3}$$

where the normalization constant $Z(y)$ is chosen so that $\mu$ is a probability measure:

$$Z(y) = \int_X \exp\bigl(-\Phi(u; y)\bigr)\, d\mu_0(u). \tag{1.4}$$

^1 In the applications in this paper $\mathcal{G}$ is found from composition of the forward model with some
form of observation operator, such as pointwise evaluation at a finite set of points. The resulting
observation operator is often denoted with the letter $\mathcal{H}$ in the atmospheric sciences community
[12]; because we need $\mathcal{H}$ for Hilbert space later on, we use the symbol $\mathcal{G}$.

In the case where $y$ is finite dimensional and $\eta$ has Lebesgue density $\rho$ this is
simply

$$\frac{d\mu}{d\mu_0}(u) \propto \rho\bigl(y - \mathcal{G}(u)\bigr). \tag{1.5}$$

More generally $\Phi$ is determined by the distribution of $y$ given $u$. We call $\Phi(u; y)$
the potential, and sometimes, for brevity, refer to evaluation of $\Phi(u; y)$ for a
particular $u \in X$ as solving the forward problem, as it is defined through $\mathcal{G}(\cdot)$. Note
that the solution to the inverse problem is a probability measure $\mu$ which is defined
through a combination of solution of the forward problem $\mathcal{G}$, the data $y$ and a prior
probability measure $\mu_0$.

In general it is hard to obtain information from a formula such as (1.3) for a
probability measure. One useful approach to extracting information is to use
sampling: generate a set of points $\{u^{(k)}\}_{k=1}^{K}$ distributed (perhaps only approximately)
according to $\mu$. In this context it is noteworthy that the integral $Z(y)$ appearing in
formula (1.3) is not needed to enable implementation of MCMC methods to
sample from the desired measure. These methods incur an error which is well
understood and which decays as $1/\sqrt{K}$ [17]. However, for inverse problems on function
space there is a second source of error, arising from the need to approximate the
inverse problem in a finite dimensional subspace of dimension $N$. The purpose
of this paper is to quantify such approximation errors. The key idea is that we
transfer approximation properties of the forward problem $\Phi$ into approximation
properties of the inverse problem defined by (1.3).
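
The remark that $Z(y)$ is not needed for MCMC can be illustrated with a minimal sketch. Everything concrete below is an assumption made for illustration and is not taken from the paper: a scalar unknown, the prior $N(0,1)$, the toy forward map $\mathcal{G}(u) = u$, Gaussian noise of standard deviation `sigma`, and an independence sampler that proposes from the prior. With prior proposals the Metropolis-Hastings acceptance ratio reduces to $\exp(\Phi(u) - \Phi(v))$, so only the potential is evaluated.

```python
import math
import random


def potential(u, y, sigma):
    """Toy potential Phi(u; y) = |y - G(u)|^2 / (2 sigma^2) with G(u) = u (an assumption)."""
    return 0.5 * ((y - u) / sigma) ** 2


def sample_posterior(y, sigma, n_samples, seed=0):
    """Independence sampler: propose from the prior N(0, 1), accept using Phi only."""
    rng = random.Random(seed)
    u = rng.gauss(0.0, 1.0)
    phi_u = potential(u, y, sigma)
    samples = []
    for _ in range(n_samples):
        v = rng.gauss(0.0, 1.0)  # proposal drawn from the prior mu_0
        phi_v = potential(v, y, sigma)
        # Accept with probability min(1, exp(Phi(u) - Phi(v))).
        # The normalization constant Z(y) cancels in this ratio.
        if math.log(1.0 - rng.random()) < phi_u - phi_v:
            u, phi_u = v, phi_v
        samples.append(u)
    return samples


samples = sample_posterior(y=1.0, sigma=1.0, n_samples=20000)
posterior_mean_estimate = sum(samples) / len(samples)
posterior_var_estimate = sum((s - posterior_mean_estimate) ** 2 for s in samples) / len(samples)
```

For this toy problem the posterior is Gaussian with mean $y/(1+\sigma^2)$ and variance $\sigma^2/(1+\sigma^2)$, which gives a reference against which the sample averages can be compared.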

Since the solution to the Bayesian inverse problem is a probability measure we
will need to use metrics on probability measures to quantify the effect of
approximation. We will employ the Hellinger metric $d_{\mathrm{Hell}}$ from Definition A.2 because this
leads directly to bounds on the approximation error incurred when calculating the
expectation of functions. This property is summarized in a lemma in the Appendix.
Combining these ideas we will find that finite dimensional approximation leads to an error
in the calculation of expectation of functions which tends to zero as $N$ tends
to infinity, at a rate governed by a function $\psi(N)$ determined by approximation of the forward
problem.

In section 2 we provide the general approximation theory, for measures $\mu$
given by (1.3), upon which the remainder of the paper builds. Section 3 employs
this approximation theory to study the classical inverse problem of determining
the initial condition for the heat equation from observation of the solution at a
positive time. In section 4 we study the inverse problem of determining the initial
condition for the Stokes equation, given a finite set of observations of Lagrangian
trajectories defined through the time-dependent velocity field solving the Stokes
equation; this section also includes numerical results showing the convergence
of the posterior distribution under refinement of the finite dimensional
approximation, as predicted by the theory. Section 5 is devoted to the related inverse
problem of determining the initial condition for the Navier-Stokes equation, given
direct observation of the time-dependent velocity field at a finite set of points at
positive times.

A classical approach to the regularization of inverse problems is through the
least squares approach and Tikhonov regularization [7, 26]; a good overview of
this approach, in the context of data assimilation problems in fluid mechanics
such as those studied in sections 4 and 5, is [19], and the connection between
the least squares and Bayesian approaches for applications in fluid mechanics is
overviewed in [1]. The Bayesian formulation of inverse problems in general is
overviewed in the text [14]. Note, however, that the methodology employed there
is typically one in which the problem is first discretized, and then ideas from
Bayesian statistics are applied to the resulting finite dimensional problem. The
approach taken in this paper is to first formulate the Bayesian inverse problem on
function space and then study approximation. As in many areas of applied
mathematics - for example, optimal control - formulation of the problem in function
space, followed by discretization, will lead to better algorithms and better
understanding. This approach is laid out conceptually in [26] for inverse problems, but
the underlying mathematics is not developed, except for some particular linear
and Gaussian problems. Indeed, for linear problems, the Bayesian approach on
function space may be found in an early paper of Franklin [8], including study
of the heat equation, the subject of section 3. More recently there has been some
work on finite dimensional linear inverse problems, using the Bayesian approach
to regularization, and considering infinite dimensional limits [10, 18] and the
limit of disappearing observational noise [11]. A general approach to the
formulation, and well-posedness, of inverse problems, adopting a Bayesian approach on
function space, is undertaken in [5]; furthermore applications to problems in fluid
mechanics are given in that paper and we will build on this material in sections 4
and 5.

2 General Framework 

In this section we establish three useful results which concern the effect of
approximation on the posterior probability measure $\mu$ given by (1.3). These three
results are Theorem 2.4, Corollary 2.5 and Theorem 2.6. The key point to notice
about these results is that they simply require the proof of various bounds and
approximation properties for the forward problem, and yet they yield approximation
results concerning the Bayesian inverse problem. The connection to probability
comes only through the choice of the space $X$, in which the bounds and
approximation properties must be proved, which must have full measure under the prior
$\mu_0$.

The probability measure of interest (1.3) is defined through a density with
respect to a prior reference measure $\mu_0$ which, by shift of origin, we take to have
mean zero. Furthermore, we assume that this reference measure is Gaussian with
covariance operator $\mathcal{C}$. We write $\mu_0 = \mathcal{N}(0, \mathcal{C})$. In fact we only use the Fernique
Theorem A.4 for $\mu_0$ and the results may be trivially extended to all measures
which satisfy the conclusion of this theorem. The Fernique Theorem holds for all
Gaussian measures on a separable Banach space and also for other measures
with tails which decay at least as fast as a Gaussian.

It is demonstrated in [25] that in many applications, including those
considered here, the potential $\Phi(\cdot\,; y)$ satisfies certain natural bounds on a Banach space
$(X, \|\cdot\|_X)$, contained in the original Hilbert space on which $\mu_0$ is defined, and of
full measure under $\mu_0$ so that $\mu_0(X) = 1$. Such bounds are summarized in the
following assumptions. We assume that the data $y$ lies in a Banach space $(Y, \|\cdot\|_Y)$.

The key point about the form of Assumption 2.1(i) is that it allows use of the
Fernique Theorem to control integrals against $\mu$. The assumption (ii) may be used to
obtain lower bounds on the normalization constant $Z(y)$.

Assumption 2.1 For some Banach space $X$ with $\mu_0(X) = 1$, the function $\Phi :
X \times Y \to \mathbb{R}$ satisfies the following:

i) for every $\varepsilon > 0$ and $r > 0$ there is $M = M(\varepsilon, r) \in \mathbb{R}$ such that, for all
$u \in X$ and $y \in Y$ with $\|y\|_Y < r$,

$$\Phi(u; y) \ge M - \varepsilon \|u\|_X^2;$$

ii) for every $r > 0$ there is an $L = L(r) > 0$ such that, for all $u \in X$ and $y \in Y$
with $\max\{\|u\|_X, \|y\|_Y\} < r$,

$$\Phi(u; y) \le L(r).$$

For Bayesian inverse problems in which a finite number of observations are
made and the observation error $\eta$ is mean zero Gaussian, the potential $\Phi$ has the
form

$$\Phi(u; y) = \frac{1}{2} |y - \mathcal{G}(u)|_\Gamma^2 \tag{2.1}$$

where $y \in \mathbb{R}^m$ is the data, $\mathcal{G} : X \to \mathbb{R}^m$ is the forward model and $|\cdot|_\Gamma$ is a
covariance weighted norm on $\mathbb{R}^m$ given by $|\cdot|_\Gamma = |\Gamma^{-1/2}\,\cdot\,|$, where $|\cdot|$ denotes the
standard Euclidean norm. In this case it is natural to express conditions on the
measure $\mu$ in terms of $\mathcal{G}$.

Assumption 2.2 For some Banach space $X$ with $\mu_0(X) = 1$, the function $\mathcal{G} :
X \to \mathbb{R}^m$ satisfies the following: for every $\varepsilon > 0$ there is $M = M(\varepsilon) \in \mathbb{R}$ such
that, for all $u \in X$,

$$|\mathcal{G}(u)|_\Gamma \le \exp\bigl(\varepsilon \|u\|_X^2 + M\bigr).$$

Lemma 2.3 Assume that $\Phi : X \times \mathbb{R}^m \to \mathbb{R}$ is given by (2.1) and let $\mathcal{G}$ satisfy
Assumptions 2.2. Assume also that $\mu_0$ is a Gaussian measure satisfying $\mu_0(X) =
1$. Then $\Phi$ satisfies Assumptions 2.1.

Proof. Assumption 2.1(i) is automatic since $\Phi$ is positive; assumption (ii) follows
from the bound

$$\Phi(u; y) \le |y|_\Gamma^2 + |\mathcal{G}(u)|_\Gamma^2$$

and use of the exponential bound on $\mathcal{G}$. □

Since the dependence on $y$ is not relevant we suppress it notationally and study
measures $\mu$ given by

$$\frac{d\mu}{d\mu_0}(u) = \frac{1}{Z} \exp\bigl(-\Phi(u)\bigr) \tag{2.2}$$

where the normalization constant $Z$ is given by

$$Z = \int_X \exp\bigl(-\Phi(u)\bigr)\, d\mu_0(u). \tag{2.3}$$

We approximate $\mu$ by approximating $\Phi$ over some $N$-dimensional subspace of
$X$. In particular we define $\mu^N$ by

$$\frac{d\mu^N}{d\mu_0}(u) = \frac{1}{Z^N} \exp\bigl(-\Phi^N(u)\bigr) \tag{2.4}$$

where

$$Z^N = \int_X \exp\bigl(-\Phi^N(u)\bigr)\, d\mu_0(u). \tag{2.5}$$

The potential $\Phi^N$ should be viewed as resulting from an approximation to the
solution of the forward problem. Our interest is in translating approximation results
for $\Phi$ into approximation results for $\mu$.

The following theorem proves such a result, bounding the Hellinger distance,
and hence by (A.6) the total variation distance, between measures $\mu$ and $\mu^N$, in
terms of the error in approximating $\Phi$. Again the particular exponential
dependence of the error constant for the forward approximation is required so that we
may use the Fernique Theorem to control certain expectations arising in the
analysis.

Theorem 2.4 Assume that $\Phi$ and $\Phi^N$ satisfy Assumptions 2.1(i),(ii) with
constants uniform in $N$. Assume also that, for any $\varepsilon > 0$, there is $K = K(\varepsilon) > 0$
such that

$$|\Phi(u) - \Phi^N(u)| \le K \exp\bigl(\varepsilon \|u\|_X^2\bigr)\, \psi(N) \tag{2.6}$$

where $\psi(N) \to 0$ as $N \to \infty$. Then the measures $\mu$ and $\mu^N$ are close with respect
to the Hellinger distance: there is a constant $C$, independent of $N$, such that

$$d_{\mathrm{Hell}}(\mu, \mu^N) \le C \psi(N). \tag{2.7}$$

Consequently all moments of $\|u\|_X$ are $O(\psi(N))$ close. In particular the mean
and, in the case $X$ is a Hilbert space, the covariance operator, are $O(\psi(N))$ close.

Proof. Throughout the proof, all integrals are over $X$. The constant $C$ may
depend upon $r$ and changes from occurrence to occurrence. Using Assumption
2.1(ii) gives

$$|Z| \ge \int_{\{\|u\|_X \le r\}} \exp(-L(r))\, d\mu_0(u) = \exp(-L(r))\, \mu_0\{\|u\|_X \le r\}.$$

This lower bound is positive because $\mu_0$ has full measure on $X$ and is Gaussian so
that all balls in $X$ have positive probability. We have an analogous lower bound
for $|Z^N|$.

From Assumptions 2.1(i) and (2.6), using the fact that $\mu_0$ is a Gaussian
probability measure so that the Fernique Theorem A.4 applies,

$$|Z - Z^N| \le \int K \psi(N) \exp\bigl(\varepsilon \|u\|_X^2 - M\bigr) \exp\bigl(\varepsilon \|u\|_X^2\bigr)\, d\mu_0(u) \le C \psi(N).$$

From the definition of Hellinger distance we have

$$2\, d_{\mathrm{Hell}}(\mu, \mu^N)^2 = \int \Bigl( Z^{-1/2} \exp\bigl(-\tfrac12 \Phi(u)\bigr) - (Z^N)^{-1/2} \exp\bigl(-\tfrac12 \Phi^N(u)\bigr) \Bigr)^2 d\mu_0(u) \le I_1 + I_2$$

where

$$I_1 = \frac{2}{Z} \int \Bigl( \exp\bigl(-\tfrac12 \Phi(u)\bigr) - \exp\bigl(-\tfrac12 \Phi^N(u)\bigr) \Bigr)^2 d\mu_0(u),$$
$$I_2 = 2\, \bigl| Z^{-1/2} - (Z^N)^{-1/2} \bigr|^2 \int \exp\bigl(-\Phi^N(u)\bigr)\, d\mu_0(u).$$

Now, again using Assumptions 2.1(i) and equation (2.6), together with the
Fernique Theorem A.4,

$$\frac{Z}{2} I_1 \le \int \frac{K^2}{4} \exp\bigl(3\varepsilon \|u\|_X^2 - M\bigr)\, \psi(N)^2\, d\mu_0(u) \le C \psi(N)^2.$$

Note that the bounds on $Z$, $Z^N$ from below are independent of $N$. Furthermore,

$$\int \exp\bigl(-\Phi^N(u)\bigr)\, d\mu_0(u) \le \int \exp\bigl(\varepsilon \|u\|_X^2 - M\bigr)\, d\mu_0(u)$$

with bound independent of $N$, by the Fernique Theorem A.4. Thus

$$I_2 \le C \bigl( Z^{-3} \vee (Z^N)^{-3} \bigr) |Z - Z^N|^2 \le C \psi(N)^2.$$

Combining gives the desired continuity result in the Hellinger metric.

Finally all moments of $u$ in $X$ are finite under the Gaussian measure $\mu_0$ by
the Fernique Theorem A.4. It follows that all moments are finite under $\mu$ and $\mu^N$
because, for $f : X \to Z$ polynomially bounded,

$$\mathbb{E}^{\mu} \|f\| \le \frac{1}{Z} \bigl( \mathbb{E}^{\mu_0} \|f\|^2 \bigr)^{1/2} \bigl( \mathbb{E}^{\mu_0} \exp(-2\Phi(u; y)) \bigr)^{1/2}$$

and the first term on the right hand side is finite since all moments are finite under
$\mu_0$, whilst the second term may be seen to be finite by use of Assumption 2.1(i)
and the Fernique Theorem A.4. □
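
The linear dependence of the Hellinger distance on $\psi(N)$ in Theorem 2.4 can be checked numerically in a toy setting. The sketch below is an illustration under stated assumptions, none of which come from the paper: a scalar unknown with prior $N(0,1)$, the forward map $\sin(u)$, a perturbed forward map $\sin(u) + \psi \cos(3u)$ standing in for the approximation, unit noise covariance, and trapezoidal quadrature on a truncated grid. It uses the convention $2\, d_{\mathrm{Hell}}^2 = \int (\sqrt{d\mu/d\mu_0} - \sqrt{d\mu^N/d\mu_0})^2\, d\mu_0$ from the proof above.

```python
import math


def hellinger_distance(psi, y=0.7, n_grid=4001, half_width=8.0):
    """Hellinger distance between mu and its perturbation, by trapezoidal quadrature."""
    h = 2.0 * half_width / (n_grid - 1)
    grid = [-half_width + i * h for i in range(n_grid)]

    def integrate(values):
        # Composite trapezoidal rule on the uniform grid.
        return h * (sum(values) - 0.5 * (values[0] + values[-1]))

    # Density of the prior N(0, 1) with respect to Lebesgue measure.
    prior = [math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi) for u in grid]
    # Exact potential and perturbed potential (hypothetical forward maps).
    phi = [0.5 * (y - math.sin(u)) ** 2 for u in grid]
    phi_n = [0.5 * (y - math.sin(u) - psi * math.cos(3.0 * u)) ** 2 for u in grid]

    z = integrate([math.exp(-p) * w for p, w in zip(phi, prior)])
    z_n = integrate([math.exp(-p) * w for p, w in zip(phi_n, prior)])
    integrand = [
        (math.exp(-0.5 * p) / math.sqrt(z) - math.exp(-0.5 * q) / math.sqrt(z_n)) ** 2 * w
        for p, q, w in zip(phi, phi_n, prior)
    ]
    return math.sqrt(0.5 * integrate(integrand))


distances = {psi: hellinger_distance(psi) for psi in (0.02, 0.01, 0.005)}
ratios = {psi: d / psi for psi, d in distances.items()}
```

If the bound (2.7) is sharp in this example, the entries of `ratios` should be roughly constant as `psi` is halved.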

For Bayesian inverse problems with finite data the potential $\Phi$ has the form
given in (2.1) where $y \in \mathbb{R}^m$ is the data, $\mathcal{G} : X \to \mathbb{R}^m$ is the forward model and
$|\cdot|_\Gamma$ is a covariance weighted norm on $\mathbb{R}^m$. In this context the following corollary
is useful.

Corollary 2.5 Assume that $\Phi$ is given by (2.1) and that $\mathcal{G}$ is approximated by a
function $\mathcal{G}^N$ with the property that, for any $\varepsilon > 0$, there is $K' = K'(\varepsilon) > 0$ such
that

$$|\mathcal{G}(u) - \mathcal{G}^N(u)| \le K' \exp\bigl(\varepsilon \|u\|_X^2\bigr)\, \psi(N) \tag{2.8}$$

where $\psi(N) \to 0$ as $N \to \infty$. If $\mathcal{G}$ and $\mathcal{G}^N$ satisfy Assumptions 2.2 uniformly in $N$
then $\Phi$ and $\Phi^N := \frac12 |y - \mathcal{G}^N(u)|_\Gamma^2$ satisfy the conditions necessary for application
of Theorem 2.4 and all the conclusions of that theorem apply.

Proof. That (i), (ii) of Assumptions 2.1 hold follows as in the proof of Lemma
2.3. Also (2.6) holds since (for some $K(\cdot)$ defined in the course of the following
chain of inequalities)

$$\begin{aligned}
|\Phi(u) - \Phi^N(u)| &\le \tfrac12 \bigl| 2y - \mathcal{G}(u) - \mathcal{G}^N(u) \bigr|_\Gamma \, \bigl| \mathcal{G}(u) - \mathcal{G}^N(u) \bigr|_\Gamma \\
&\le \bigl( |y| + \exp(\varepsilon \|u\|_X^2 + M) \bigr) \times K'(\varepsilon) \exp\bigl(\varepsilon \|u\|_X^2\bigr)\, \psi(N) \\
&\le K(2\varepsilon) \exp\bigl(2\varepsilon \|u\|_X^2\bigr)\, \psi(N)
\end{aligned}$$

as required. □

A notable fact concerning Theorem 2.4 is that the rate of convergence attained
in the solution of the forward problem, encapsulated in approximation of the
function $\Phi$ by $\Phi^N$, is transferred into the rate of convergence of the related inverse
problem for measure $\mu$ given by (2.2) and its approximation by $\mu^N$. Key to
achieving this transfer of rates of convergence is the dependence of the constant in the
forward error bound (2.6) on $u$. In particular it is necessary that this constant is
integrable by use of the Fernique Theorem A.4. In some applications it is not
possible to obtain such dependence. Then convergence results can sometimes still
be obtained, but at weaker rates. We now describe a theory for this situation.

Theorem 2.6 Assume that $\Phi$ and $\Phi^N$ satisfy Assumptions 2.1(i),(ii) with
constants uniform in $N$. Assume also that, for any $R > 0$ there is $K = K(R) > 0$
such that, for all $u$ with $\|u\|_X \le R$,

$$|\Phi(u) - \Phi^N(u)| \le K \psi(N) \tag{2.9}$$

where $\psi(N) \to 0$ as $N \to \infty$. Then the measures $\mu$ and $\mu^N$ are close with respect
to the Hellinger distance:

$$d_{\mathrm{Hell}}(\mu, \mu^N) \to 0 \tag{2.10}$$

as $N \to \infty$. Consequently all moments of $\|u\|_X$ under $\mu^N$ converge to
corresponding moments under $\mu$ as $N \to \infty$. In particular the mean and, in the case
$X$ is a Hilbert space, the covariance operator, converge.

Proof. Throughout the proof, all integrals are over $X$ unless specified otherwise.
The constant $C$ changes from occurrence to occurrence. The normalization
constants $Z$ and $Z^N$ satisfy lower bounds which are identical to that proved for $Z$ in
the course of establishing Theorem 2.4.

From Assumptions 2.1(i) and (2.9),

$$\begin{aligned}
|Z - Z^N| &\le \int_X \bigl| \exp(-\Phi(u)) - \exp(-\Phi^N(u)) \bigr|\, d\mu_0(u) \\
&\le \int_{\{\|u\|_X \le R\}} \exp\bigl(\varepsilon \|u\|_X^2 - M\bigr)\, |\Phi(u) - \Phi^N(u)|\, d\mu_0(u)
 + \int_{\{\|u\|_X > R\}} 2 \exp\bigl(\varepsilon \|u\|_X^2 - M\bigr)\, d\mu_0(u) \\
&\le \exp\bigl(\varepsilon R^2 - M\bigr) K(R)\, \psi(N) + J_R \\
&:= K_1(R)\, \psi(N) + J_R.
\end{aligned}$$

Here

$$J_R = \int_{\{\|u\|_X > R\}} 2 \exp\bigl(\varepsilon \|u\|_X^2 - M\bigr)\, d\mu_0(u).$$

Now, again by the Fernique Theorem A.4, $J_R \to 0$ as $R \to \infty$ so, for any
$\delta > 0$, we may choose $R > 0$ such that $J_R < \delta$. Now choose $N$ large enough that
$K_1(R)\psi(N) < \delta$ to deduce that $|Z - Z^N| < 2\delta$. Since $\delta > 0$ is arbitrary this
proves that $Z^N \to Z$ as $N \to \infty$.

From the definition of Hellinger distance we have

$$2\, d_{\mathrm{Hell}}(\mu, \mu^N)^2 = \int \Bigl( Z^{-1/2} \exp\bigl(-\tfrac12 \Phi(u)\bigr) - (Z^N)^{-1/2} \exp\bigl(-\tfrac12 \Phi^N(u)\bigr) \Bigr)^2 d\mu_0(u) \le I_1 + I_2$$

where

$$I_1 = \frac{2}{Z} \int \Bigl( \exp\bigl(-\tfrac12 \Phi(u)\bigr) - \exp\bigl(-\tfrac12 \Phi^N(u)\bigr) \Bigr)^2 d\mu_0(u),$$
$$I_2 = 2\, \bigl| Z^{-1/2} - (Z^N)^{-1/2} \bigr|^2 \int \exp\bigl(-\Phi^N(u)\bigr)\, d\mu_0(u).$$

Now, again using Assumptions 2.1(i) and equation (2.9),

$$\begin{aligned}
\frac{Z}{2} I_1 &\le \int_{\{\|u\|_X \le R\}} \frac{K(R)^2}{4} \exp\bigl(\varepsilon \|u\|_X^2 - M\bigr)\, \psi(N)^2\, d\mu_0(u)
 + \int_{\{\|u\|_X > R\}} 2 \exp\bigl(\varepsilon \|u\|_X^2 - M\bigr)\, d\mu_0(u) \\
&\le K_2(R)\, \psi(N)^2 + J_R,
\end{aligned}$$

for suitably chosen $K_2 = K_2(R)$. An argument similar to the one above for $|Z -
Z^N|$ shows that $I_1 \to 0$ as $N \to \infty$.

Note that the bounds on $Z$, $Z^N$ from below are independent of $N$. Furthermore,

$$\int \exp\bigl(-\Phi^N(u)\bigr)\, d\mu_0(u) \le \int \exp\bigl(\varepsilon \|u\|_X^2 - M\bigr)\, d\mu_0(u)$$

with bound independent of $N$, by the Fernique Theorem A.4. Thus

$$\bigl| Z^{-1/2} - (Z^N)^{-1/2} \bigr|^2 \le C \bigl( Z^{-3} \vee (Z^N)^{-3} \bigr) |Z - Z^N|^2$$

and so $I_2 \to 0$ as $N \to \infty$. Combining gives the desired continuity result in the
Hellinger metric.

The proof may be completed by the same arguments used in Theorem 2.4. □

3 The Heat Equation 

Here we consider a problem where the solution of the heat equation is noisily
observed at some fixed positive time $T > 0$. To be concrete we consider the heat
equation on a bounded open set $D \subset \mathbb{R}^d$, with Dirichlet boundary conditions, and
written as an ODE in Hilbert space $\mathcal{H} = L^2(D)$:

$$\frac{dv}{dt} + Av = 0, \quad v(0) = u. \tag{3.1}$$

Here $A = -\Delta$ with $D(A) = H^2(D) \cap H_0^1(D)$. We define the Sobolev spaces $\mathcal{H}^s$
as in (A.2) with $\mathcal{H} = \mathcal{H}^0 = L^2(D)$. We assume sufficient regularity conditions
on $D$ and its boundary $\partial D$ to ensure that the operator $A$ is the generator of an
analytic semigroup and we use (A.4) without comment in what follows.

Assume that we observe the solution $v$ at time $T$, subject to error in the form of
a Gaussian random field, and that we wish to recover the initial condition $u$. This
problem is classically ill-posed, because the heat equation is smoothing, and
inversion of this operator is not continuous on any Sobolev space $\mathcal{H}^s$. Nonetheless,
we will construct a well-defined Bayesian inverse problem. We state a theorem
showing that the posterior measure is equivalent (in the sense of measures) to the
prior measure and then study the effect of approximation via a spectral method
in Theorem 3.3, showing that the approximation error in the inverse problem is
exponentially small.

We place a prior measure on $u$ which is a Gaussian measure $\mu_0 \sim \mathcal{N}(m_0, C_0)$
with $C_0 = \beta A^{-\alpha}$, for some $\beta > 0$, $\alpha > d/2$. The lower bound on $\alpha$ ensures that
samples from the prior are continuous functions (Lemma A.5).

We assume that the observation is a function $y \in \mathcal{H}$ and we model it as

$$y = e^{-AT} u + \eta \tag{3.2}$$

where $\eta \sim \mathcal{N}(0, C_1)$ and $C_1 = \delta A^{-\gamma}$ for some $\delta > 0$ and $\gamma > d/2$ so that $\eta$ is
almost surely continuous, by Lemma A.5. The forward model $\mathcal{G} : \mathcal{H} \to \mathcal{H}$ is
given by $\mathcal{G}(u) = e^{-AT} u$.

By conditioning the Gaussian random variable $(u, y) \in \mathcal{H} \times \mathcal{H}$ we find that
the posterior measure for $u \mid y$ is also Gaussian $\mathcal{N}(m, C)$ with mean

$$m = m_0 + \frac{\beta}{\delta} e^{-AT} A^{\gamma - \alpha} \Bigl( I + \frac{\beta}{\delta} e^{-2AT} A^{\gamma - \alpha} \Bigr)^{-1} \bigl( y - e^{-AT} m_0 \bigr) \tag{3.3}$$

and covariance operator

$$C = \beta A^{-\alpha} \Bigl( I + \frac{\beta}{\delta} e^{-2AT} A^{\gamma - \alpha} \Bigr)^{-1}. \tag{3.4}$$
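
Because the prior covariance, the noise covariance and the forward map are all functions of $A$, the formulas for the posterior mean and covariance act independently on each eigen-coefficient. The sketch below evaluates them mode by mode and cross-checks against scalar Gaussian conditioning in precision form. It is only an illustration: the eigenvalues $\lambda_k = k^2$ (the Dirichlet Laplacian on $(0, \pi)$ in one dimension) and all parameter values are assumptions, and the function names are hypothetical.

```python
import math


def posterior_mode(lam, m0, y, T, beta, alpha, delta, gamma):
    """Posterior mean and variance of one eigen-coefficient, following (3.3) and (3.4)."""
    ratio = (beta / delta) * lam ** (gamma - alpha)
    denom = 1.0 + ratio * math.exp(-2.0 * lam * T)
    mean = m0 + ratio * math.exp(-lam * T) * (y - math.exp(-lam * T) * m0) / denom
    var = beta * lam ** (-alpha) / denom
    return mean, var


def posterior_mode_direct(lam, m0, y, T, beta, alpha, delta, gamma):
    """Same quantities from scalar Gaussian conditioning written in precision form."""
    c0 = beta * lam ** (-alpha)   # prior variance of this coefficient
    c1 = delta * lam ** (-gamma)  # noise variance of this coefficient
    g = math.exp(-lam * T)        # forward multiplier exp(-lambda T)
    var = 1.0 / (1.0 / c0 + g * g / c1)
    mean = var * (m0 / c0 + g * y / c1)
    return mean, var


# Illustrative parameters (assumptions): d = 1, so alpha, gamma > 1/2.
params = dict(T=0.1, beta=2.0, alpha=1.5, delta=0.5, gamma=1.0)
modes = [
    (posterior_mode(k * k, 0.3, -0.4 + 0.1 * k, **params),
     posterior_mode_direct(k * k, 0.3, -0.4 + 0.1 * k, **params))
    for k in range(1, 7)
]
```

Each entry of `modes` pairs the two evaluations for one eigenvalue, so agreement between them is a check on the algebra relating the two forms.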

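We can also derive a formula for the Radon-Nikodym derivative between
$\mu(du) = \mathbb{P}(du \mid y)$ and the prior $\mu_0(du)$. We define $\Phi : \mathcal{H} \times \mathcal{H} \to \mathbb{R}$ by

$$\Phi(u; y) = \frac12 \bigl\| C_1^{-1/2} e^{-AT} u \bigr\|^2 - \bigl\langle C_1^{-1/2} e^{-\frac12 AT} u,\; C_1^{-1/2} e^{-\frac12 AT} y \bigr\rangle. \tag{3.5}$$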

It is a straightforward application of the theory of Gaussian measures (Si llTI . 
using the continuity properties of $ established below, to prove the following: 

Theorem 3.1 ([25]) Consider the inverse problem for the initial condition $u$ in
(3.1), subject to observation in the form (3.2) with observational noise $\eta \sim \mathcal{N}(0, \delta A^{-\gamma})$,
$\delta > 0$ and $\gamma > d/2$. Assume that the prior measure is a Gaussian $\mu_0 = \mathcal{N}(m_0, \beta A^{-\alpha})$
with $m_0 \in \mathcal{H}^\alpha$, $\beta > 0$ and $\alpha > d/2$. Then the posterior measure $\mu$ is Gaussian with
mean and variance determined by (3.3) and (3.4). Furthermore, $\mu$ and the prior
measure $\mu_0$ are equivalent Gaussian measures with Radon-Nikodym derivative
(1.3) given by (3.5).

Now we study the properties of $\Phi$. To this end it is helpful to define, for any
$\theta > 0$, the compact operator $K_\theta : \mathcal{H} \to \mathcal{H}$ given by

$$K_\theta := C_1^{-1/2} e^{-\theta AT}.$$

Note that, for any $0 < \theta_1 < \theta_2 < \infty$ there is $C > 0$ such that, for all $u \in \mathcal{H}$,

$$\|K_{\theta_2} u\| \le C \|K_{\theta_1} u\|.$$

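Lemma 3.2 The function $\Phi : \mathcal{H} \times \mathcal{H} \to \mathbb{R}$ satisfies Assumptions 2.1 with $X =
Y = \mathcal{H}$ and, furthermore, for any $\varepsilon \in (0, 1)$, there is $C = C(\varepsilon)$ such that

$$|\Phi(u; y) - \Phi(v; y)| \le C \bigl( \|K_1 u\| + \|K_1 v\| + \|K_\varepsilon y\| \bigr) \|K_{1-\varepsilon} u - K_{1-\varepsilon} v\|.$$

In particular, $\Phi(\cdot\,; y) : \mathcal{H} \to \mathbb{R}$ is continuous.

Proof. We may write

$$\Phi(u; y) = \frac12 \|K_1 u\|^2 - \bigl\langle K_{1/2} u, K_{1/2} y \bigr\rangle.$$

By the Cauchy-Schwarz inequality we have, for any $\delta > 0$,

$$\Phi(u; y) \ge -\frac{\delta}{2} \|K_{1/2} u\|^2 - \frac{1}{2\delta} \|K_{1/2} y\|^2$$

so that, by the compactness of $K_{1/2}$, Assumption 2.1(i) holds. Assumption 2.1(ii)
holds, by a similar Cauchy-Schwarz argument, with

$$\Phi(u; y) \le \frac12 \bigl\| C_1^{-1/2} e^{-AT} u \bigr\|^2 + \frac12 \bigl\| C_1^{-1/2} e^{-\frac12 AT} y \bigr\|^2 + \frac12 \bigl\| C_1^{-1/2} e^{-\frac12 AT} u \bigr\|^2$$

so that, by the compactness of $K_\theta$,

$$\Phi(u; y) \le C \bigl( 1 + \|u\|^2 \bigr). \tag{3.6}$$

Note that

$$\Phi(u; y) - \Phi(v; y) = \frac12 \bigl\langle K_1 (u + v), K_1 (u - v) \bigr\rangle - \bigl\langle K_\varepsilon y, K_{1-\varepsilon} (u - v) \bigr\rangle.$$

Since $\Phi$ is quadratic in $u$ the desired Lipschitz property holds. □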

Now we consider approximation of the posterior measure $\mu$ given by (3.5).
Specifically we define $P^N$ to be orthogonal projection in $\mathcal{H}$ onto the subspace spanned by
$\{\phi_k\}_{|k| \le N}$ (a subset of the eigenfunctions of $A$ as defined just before (A.1)) and
define the measure $\mu^N$ given by

$$\frac{d\mu^N}{d\mu_0}(u) \propto \exp\bigl( -\Phi(P^N u; y) \bigr). \tag{3.7}$$

The measure $\mu^N$ is identical to $\mu_0$ on the orthogonal complement of $P^N \mathcal{H}$. We
now use the theory from the preceding section to estimate the distance between $\mu$
and $\mu^N$.

Theorem 3.3 There are constants $c_1 > 0$, $c_2 > 0$, independent of $N$, such that
$d_{\mathrm{Hell}}(\mu, \mu^N) \le c_1 \exp(-c_2 N^2)$. Consequently the mean and covariance operator
of $\mu$ and $\mu^N$ are $O(\exp(-c_2 N^2))$ close in the $\mathcal{H}$ and $\mathcal{H}$-operator norms
respectively.

Proof. We apply Theorem 2.4 with $X = \mathcal{H}$. By Lemma 3.2, together with the
fact that $\|P^N u\| \le \|u\|$, we deduce that Assumptions 2.1 hold for $\Phi$ and $\Phi^N := \Phi(P^N \cdot\,; y)$ with
constants independent of $N$. Furthermore, from the Lipschitz bound in Lemma
3.2 we have

$$|\Phi(u; y) - \Phi(P^N u; y)| \le c \bigl( \|u\| + \|y\| \bigr) \bigl\| K_{1/2} (u - P^N u) \bigr\|.$$

But

$$\bigl\| K_{1/2} (u - P^N u) \bigr\|^2 = \frac{1}{\delta} \sum_{|k| > N} \lambda_k^{\gamma} e^{-\lambda_k T} |u_k|^2,$$

where $\lambda_k$ are the eigenvalues of $A$ and $u_k$ the coefficients of $u$ in the eigenbasis.
Since the eigenvalues grow like $|k|^2$, and since $x^{\gamma} \exp(-xT)$ is monotonic
decreasing for $x$ sufficiently large, we deduce that

$$\bigl\| K_{1/2} (u - P^N u) \bigr\|^2 \le c_1 \exp(-c_2 N^2) \sum_{|k| > N} |u_k|^2 \le c_1 \exp(-c_2 N^2) \|u\|^2.$$

The result follows (possibly by redefinition of $c_1$, $c_2$). □
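
The exponential smallness used in this proof comes from the factor $\lambda_k^{\gamma} e^{-\lambda_k T}$ over the discarded modes. The following sketch tabulates that factor and compares it with an explicit envelope $c_1 \exp(-c_2 N^2)$. It is an illustration under assumptions not made in the paper: eigenvalues $\lambda_k = k^2$ (one space dimension), illustrative values of $T$, $\gamma$, $\delta$, and the particular choice $c_2 = T/2$, $c_1 = (2\gamma/(eT))^{\gamma}/\delta$, which follows from maximizing $x^{\gamma} e^{-xT/2}$ over $x > 0$.

```python
import math


def tail_factor(n_cut, T=0.1, gamma=1.0, delta=1.0, n_extra=200):
    """Largest value of lambda_k^gamma * exp(-lambda_k T) / delta over N < k <= N + n_extra.

    Uses lambda_k = k^2 (an assumption). The maximum is taken over a finite
    window, which suffices because the terms eventually decrease in k.
    """
    return max(
        (k * k) ** gamma * math.exp(-(k * k) * T) / delta
        for k in range(n_cut + 1, n_cut + n_extra + 1)
    )


def envelope(n_cut, T=0.1, gamma=1.0, delta=1.0):
    """Explicit bound c1 * exp(-c2 N^2) with c2 = T / 2 and c1 = (2 gamma / (e T))^gamma / delta."""
    c1 = (2.0 * gamma / (math.e * T)) ** gamma / delta
    return c1 * math.exp(-0.5 * T * n_cut * n_cut)


table = [(n, tail_factor(n), envelope(n)) for n in range(0, 31, 5)]
```

With these definitions, $\|K_{1/2}(u - P^N u)\|^2$ is at most `tail_factor(N)` times $\|u\|^2$, and each row of `table` shows the factor sitting below the envelope.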



4 Lagrangian Data Assimilation 

In this section we turn to a non-Gaussian nonlinear example where the full power 
of the abstract theory is required. In oceanography a commonly used method of 
gathering data about ocean currents, temperature, salinity and so forth is through 
the use of Lagrangian instruments: objects transported by the fluid velocity field, 
which transmit positional information using GPS. The inverse problem termed
Lagrangian data assimilation is to determine the velocity field in the ocean from
the Lagrangian data [13, 16].

In this section we study an idealized model which captures the essence of 
Lagrangian data assimilation as practised in oceanography. For the fluid flow 
model we use the Stokes equations, describing incompressible Newtonian fluids 
at moderate Reynolds number. The real equations of oceanography are, of course, 
far more complex, requiring evolution of coupled fields for velocity, temperature 
and salinity. However the dissipative and incompressible nature of the flow field 
for the Stokes equations captures the key mathematical properties of ocean flows, 
and hence provides a useful simplified model. 

We consider the incompressible Stokes equations written in the form:

$$\frac{\partial v}{\partial t} = \nu \Delta v - \nabla p + f, \quad (x, t) \in D \times [0, \infty), \tag{4.1a}$$
$$\nabla \cdot v = 0, \quad (x, t) \in D \times [0, \infty), \tag{4.1b}$$
$$v = u, \quad (x, t) \in D \times \{0\}. \tag{4.1c}$$

Here $D$ is the unit square. We impose periodic boundary conditions on the
velocity field $v$ and the pressure $p$. We assume that $f$ has zero average over $D$, noting
that this implies the same for $v(x, t)$, provided that $u(x) = v(x, 0)$ has zero initial
average. See [27, 28] for definitions of the Leray projector $P : L^2_{\mathrm{per}} \to \mathcal{H}$ and
Stokes operator $A$. We employ the Hilbert spaces $\{\mathcal{H}^s, \|\cdot\|_s\}$ defined by (A.2)
and note that $\mathcal{H}^s = D(A^{s/2})$ for any $s > 0$.

The PDE can be formulated as a linear dynamical system on the Hilbert space

$$\mathcal{H} = \Bigl\{ u \in L^2_{\mathrm{per}}(D) \;\Big|\; \int_D u\, dx = 0,\; \nabla \cdot u = 0 \Bigr\} \tag{4.2}$$

with the usual $L^2(D)$ norm and inner-product on this subspace of $L^2_{\mathrm{per}}(D)$. If we
let $\psi = Pf$ then we may write the equation (4.1) as an ODE in Hilbert space $\mathcal{H}$:

$$\frac{dv}{dt} + \nu A v = \psi, \quad v(0) = u. \tag{4.3}$$

We assume that we are given noisy observations of $J$ Lagrangian tracers with
positions $z_j$ solving the integral equations

$$z_j(t) = z_{j,0} + \int_0^t v(z_j(s), s)\, ds. \tag{4.4}$$

For simplicity assume that we observe all the tracers $z$ at the same set of
positive times $\{t_k\}_{k=1}^{K}$ and that the initial particle tracer positions $z_{j,0}$ are known to
us:

$$y_{j,k} = z_j(t_k) + \eta_{j,k}, \quad j = 1, \dots, J, \quad k = 1, \dots, K, \tag{4.5}$$

where the $\eta_{j,k}$'s are zero mean Gaussian random variables. Concatenating data we
may write

$$y = \mathcal{G}(u) + \eta \tag{4.6}$$

with $y = (y_{1,1}, \dots, y_{J,K})^*$ and $\eta \sim \mathcal{N}(0, \Gamma)$ for some covariance matrix $\Gamma$ capturing
the correlations present in the noise. Note that $\mathcal{G}$ is a complicated function of the
initial condition for the Stokes equations, describing the mapping from this initial
condition into the positions of Lagrangian trajectories at positive times. We will
show that the function $\mathcal{G}$ maps $\mathcal{H}$ into $\mathbb{R}^{2JK}$, and is continuous on a dense
subspace of $\mathcal{H}$.
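
To make the map from initial velocity to tracer positions concrete, here is a minimal sketch of the noise-free part of the observation model for a single tracer. The setting is an assumption chosen so that an exact answer is available, and is not the configuration used in the paper: no forcing, an initial condition consisting of one Fourier mode with stream function `AMP * sin(2 pi k.x)`, so that the Stokes solution is that mode multiplied by $\exp(-4\pi^2 \nu |k|^2 t)$, and a classical fourth-order Runge-Kutta integrator for the tracer equation. All names and numerical values are illustrative.

```python
import math

NU = 0.05    # viscosity (illustrative value)
K = (1, 2)   # wavenumber of the single active Fourier mode (illustrative)
AMP = 0.1    # initial stream-function amplitude (illustrative)
RATE = NU * 4.0 * math.pi ** 2 * (K[0] ** 2 + K[1] ** 2)  # decay rate of this mode


def velocity(z, t):
    """Divergence-free velocity (dPsi/dx2, -dPsi/dx1) for Psi = AMP e^{-RATE t} sin(2 pi k.x)."""
    phase = 2.0 * math.pi * (K[0] * z[0] + K[1] * z[1])
    scale = 2.0 * math.pi * AMP * math.exp(-RATE * t) * math.cos(phase)
    return (scale * K[1], -scale * K[0])


def rk4_step(z, t, h):
    """One classical Runge-Kutta step for dz/dt = velocity(z, t)."""
    k1 = velocity(z, t)
    k2 = velocity((z[0] + 0.5 * h * k1[0], z[1] + 0.5 * h * k1[1]), t + 0.5 * h)
    k3 = velocity((z[0] + 0.5 * h * k2[0], z[1] + 0.5 * h * k2[1]), t + 0.5 * h)
    k4 = velocity((z[0] + h * k3[0], z[1] + h * k3[1]), t + h)
    return (
        z[0] + h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0,
        z[1] + h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0,
    )


def tracer_path(z0, times, dt=1e-3):
    """Tracer positions at the increasing observation times (positions are not wrapped)."""
    z, t, positions = tuple(z0), 0.0, []
    for t_obs in times:
        n_steps = max(1, math.ceil((t_obs - t) / dt))
        h = (t_obs - t) / n_steps
        for i in range(n_steps):
            z = rk4_step(z, t + i * h, h)
        t = t_obs
        positions.append(z)
    return positions


def exact_position(z0, t):
    """Closed form for this one-mode flow: k.z is conserved, so the velocity direction is fixed."""
    phase = 2.0 * math.pi * (K[0] * z0[0] + K[1] * z0[1])
    shift = 2.0 * math.pi * AMP * math.cos(phase) * (1.0 - math.exp(-RATE * t)) / RATE
    return (z0[0] + shift * K[1], z0[1] - shift * K[0])


observation_times = [0.25, 0.5, 1.0]
positions = tracer_path((0.2, 0.3), observation_times)
```

Adding independent Gaussian perturbations to the entries of `positions` would then give synthetic data of the form (4.5); for a general initial condition the velocity would be a sum over many modes and no closed form would be available.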

The objective of the inverse problem is to find the initial velocity field $u$,
given $y$. We adopt a Bayesian approach and identify $\mu(du) = \mathbb{P}(u \mid y)\, du$. We now
spend some time developing the Bayesian framework, culminating in Theorem 4.3
which shows that $\mu$ is well-defined. The reader interested purely in approximation
of $\mu$ can skip straight to Theorem 4.4.

The following result shows that the tracer equations (4.4) have a solution,
under mild regularity assumptions on the initial data. An analogous result is proved
in [6] for the case where the velocity field is governed by the Navier-Stokes
equation and the proof may be easily extended to the case of the Stokes equations.

Theorem 4.1 Let $\psi \in L^2(0, T; \mathcal{H})$ and let $v \in C([0, T]; \mathcal{H})$ denote the solution
of (4.3) with initial data $u \in \mathcal{H}$. Then the integral equation (4.4) has a unique
solution $z \in C([0, T], \mathbb{R}^2)$.

We assume throughout that $\psi$ is sufficiently regular that this theorem applies.
To determine a formula for the probability of $u$ given $y$, we apply the Bayesian
approach described in [5] for the Navier-Stokes equations, and easily generalized
to the Stokes equations. For the prior measure we take $\mu_0 = \mathcal{N}(0, \beta A^{-\alpha})$ for
some $\beta > 0$, $\alpha > 1$, with the condition on $\alpha$ chosen to ensure that draws from the
prior are in $\mathcal{H}$, by Lemma A.5. We condition the prior on the observations, to find
the posterior measure on $u$. The likelihood of $y$ given $u$ is

$$\mathbb{P}(y \mid u) \propto \exp\Bigl( -\frac12 |y - \mathcal{G}(u)|_\Gamma^2 \Bigr).$$

This suggests the formula

$$\frac{d\mu}{d\mu_0}(u) \propto \exp\bigl( -\Phi(u; y) \bigr) \tag{4.7}$$

where

$$\Phi(u; y) := \frac12 |y - \mathcal{G}(u)|_\Gamma^2 \tag{4.8}$$

and $\mu_0$ is the prior Gaussian measure. We now make this assertion rigorous. The
first step is to study the properties of the forward model $\mathcal{G}$. Proof of the
following lemma is given after statement and proof of the main approximation result,
Theorem 4.4.



Lemma 4.2 Assume that $\psi \in C([0, T]; \mathcal{H}^\gamma)$ for some $\gamma \ge 0$. Consider the
forward model $\mathcal{G} : \mathcal{H}^\ell \to \mathbb{R}^{2JK}$ defined by (4.5), (4.6).

• If $\gamma \ge 0$ then, for any $\ell \ge 0$ there is $C > 0$ such that, for all $u \in \mathcal{H}^\ell$,

$$|\mathcal{G}(u)| \le C \bigl( 1 + \|u\|_\ell \bigr).$$

• If $\gamma > 0$ then, for any $\ell > 0$ and $R > 0$ and for all $u_1$, $u_2$ with $\|u_1\|_\ell \vee
\|u_2\|_\ell < R$, there is $L = L(R) > 0$ such that

$$|\mathcal{G}(u_1) - \mathcal{G}(u_2)| \le L \|u_1 - u_2\|_\ell.$$

Furthermore, for any $\varepsilon > 0$, there is $M > 0$ such that $L(R) \le M \exp(\varepsilon R^2)$.

Thus $\mathcal{G}$ satisfies Assumptions 2.2 with $X = \mathcal{H}^s$ and any $s \ge 0$.

Since $\mathcal{G}$ is continuous on $\mathcal{H}^\ell$ for $\ell > 0$ and since, by Lemma A.5, draws
from $\mu_0$ are almost surely in $\mathcal{H}^s$ for any $s < \alpha - 1$, use of the techniques in [5],
employing the Stokes equation in place of the Navier-Stokes equation, shows the
following:

Theorem 4.3 Assume that $\psi \in C([0, T]; \mathcal{H}^\gamma)$, for some $\gamma > 0$, and that the prior
measure $\mu_0 = \mathcal{N}(0, \beta A^{-\alpha})$ is chosen with $\beta > 0$ and $\alpha > 1$. Then the measure
$\mu(du) = \mathbb{P}(du \mid y)$ is absolutely continuous with respect to the prior $\mu_0(du)$, with
Radon-Nikodym derivative given by (4.7).

In fact the theory in [HI may be used to show that the measure fi is Lipschitz 
in the data $y$, in the Hellinger metric. This well-posedness underlies the following
study of the approximation of $\mu$ in a finite dimensional space. We define $P^N$ to
be orthogonal projection in $\mathcal{H}$ onto the subspace spanned by $\{\phi_k\}_{|k| \le N}$; recall
that $k \in \mathbb{K} := \mathbb{Z}^2 \setminus \{0\}$. Since $P^N$ is an orthogonal projection in any
$\mathcal{H}^a$, we have $\|P^N u\|_X \le \|u\|_X$. Define

$$\mathcal{G}^N(u) := \mathcal{G}(P^N u).$$

The approximate posterior measure $\mu^N$ is given by (4.7) with $\mathcal{G}$ replaced by $\mathcal{G}^N$.
As in the last section it is identical to the prior on the orthogonal complement of
$P^N\mathcal{H}$. On $P^N\mathcal{H}$ itself the measure is finite dimensional and amenable to
sampling techniques as demonstrated in [4]. We now quantify the error arising from
approximation of $\mathcal{G}$ in the finite dimensional subspace $P^N X$.

Theorem 4.4 Let the assumptions of Theorem 4.3 hold. Then, for any $q < \alpha - 1$,
there is a constant $c > 0$, independent of $N$, such that
$d_{\mathrm{Hell}}(\mu, \mu^N) \le cN^{-q}$. Consequently the mean and covariance operator of
$\mu$ and $\mu^N$ are $O(N^{-q})$ close in the $\mathcal{H}$ and $\mathcal{H}$-operator norms respectively.

Proof. We set $X = \mathcal{H}^s$ for any $s \in (0, \alpha - 1)$. We employ Corollary 2.5. Clearly,
since $\mathcal{G}$ satisfies Assumptions 2.2 by Lemma 4.2, so too does $\mathcal{G}^N$, with constants
uniform in $N$. It remains to establish (2.8). Write $u \in \mathcal{H}^s$ as

$$u = \sum_{k \in \mathbb{K}} u_k \phi_k$$

and note that

$$\sum_{k \in \mathbb{K}} |k|^{2s}|u_k|^2 < \infty.$$
We have, for any $\ell \in (0, s)$,

$$\|u - P^N u\|_{\ell}^2 = \sum_{|k|>N} |k|^{2\ell}|u_k|^2
= \sum_{|k|>N} |k|^{2(\ell-s)}|k|^{2s}|u_k|^2
\le N^{-2(s-\ell)}\sum_{|k|>N} |k|^{2s}|u_k|^2
\le C\|u\|_s^2 N^{-2(s-\ell)}.$$

By the Lipschitz properties of $\mathcal{G}$ from Lemma 4.2 we deduce that, for any
$\ell \in (0, s)$,

$$|\mathcal{G}(u) - \mathcal{G}(P^N u)| \le M\exp(\epsilon\|u\|_{\ell}^2)\|u - P^N u\|_{\ell}
\le C^{1/2} M\exp(\epsilon\|u\|_s^2)\|u\|_s N^{-(s-\ell)}.$$

This establishes the desired error bound (2.8). It follows from Corollary 2.5 that
$\mu^N$ is $O(N^{-(s-\ell)})$ close to $\mu$ in the Hellinger distance. Choosing $s$ arbitrarily close
to its upper bound, and $\ell$ arbitrarily close to zero, yields the optimal exponent $q$ as
appears in the theorem statement. □
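The projection estimate used in this proof can be checked numerically. The sketch below is an illustration only: it assumes a finite set of randomly generated Fourier coefficients on a square block of wavevectors, drops the constant factors in the eigenvalues (so $|k|^{2s}$ stands in for $\lambda_k^s$), and uses hypothetical helper names.

```python
import math
import random

def random_coeffs(k_max, decay, seed=0):
    """Random real coefficients u_k on 0 < |k|_inf <= k_max with algebraic decay."""
    rng = random.Random(seed)
    return {(k1, k2): rng.gauss(0.0, 1.0) * math.hypot(k1, k2) ** (-decay)
            for k1 in range(-k_max, k_max + 1)
            for k2 in range(-k_max, k_max + 1)
            if (k1, k2) != (0, 0)}

def full_norm_sq(coeffs, s):
    """sum_k |k|^(2s) |u_k|^2, i.e. ||u||_s^2 up to a constant factor."""
    return sum(math.hypot(*k) ** (2 * s) * uk ** 2 for k, uk in coeffs.items())

def tail_norm_sq(coeffs, n_cut, ell):
    """sum over |k| > n_cut of |k|^(2 ell) |u_k|^2, i.e. ||u - P^N u||_ell^2."""
    return sum(math.hypot(*k) ** (2 * ell) * uk ** 2
               for k, uk in coeffs.items() if math.hypot(*k) > n_cut)

def tail_bound(coeffs, n_cut, s, ell):
    """Right-hand side N^(-2(s - ell)) * ||u||_s^2 of the projection estimate."""
    return n_cut ** (-2.0 * (s - ell)) * full_norm_sq(coeffs, s)
```

For any cut-off `n_cut` and any `ell < s`, `tail_norm_sq` should not exceed `tail_bound`, mirroring the inequality above.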

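Proof of Lemma 4.2. Throughout the proof, the constant $C$ may change from
instance to instance, but is always independent of the $u_i$. It suffices to consider a
single observation so that $J = K = 1$. Let $z^{(i)}(t)$ solve

$$z^{(i)}(t) = z^{(i)}_0 + \int_0^t v^{(i)}\bigl(z^{(i)}(\tau), \tau\bigr)\,d\tau$$

where $v^{(i)}(x, t)$ solves (4.1) with $u = u_i$.
Let $\ell \in [0, 2 + \gamma)$. Recall that, by (A.5),

$$\|v^{(i)}(t)\|_s \le C\Bigl(\frac{1}{t^{(s-\ell)/2}}\|u_i\|_{\ell} + \|\psi\|_{C([0,T];\mathcal{H}^{\gamma})}\Bigr) \qquad (4.9)$$

for $s \in [\ell, 2 + \gamma)$. Also, by linearity and (A.4),

$$\|v^{(1)}(t) - v^{(2)}(t)\|_s^2 \le \frac{C}{t^{(s-\ell)}}\|u_1 - u_2\|_{\ell}^2. \qquad (4.10)$$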
To prove the first part of the lemma note that, by the Sobolev embedding
Theorem, for any $s > 1$,

$$|z^{(i)}(t)| \le C\Bigl(1 + \int_0^t \|v^{(i)}(\cdot, \tau)\|_s\,d\tau\Bigr).$$

For any $\gamma \ge 0$ and $\ell \in [0, 2 + \gamma)$ we may choose $s$ such that
$s \in [\ell, 2 + \gamma) \cap (1, \ell + 2)$. Thus the singularity is integrable and we have,
for any $t \ge 0$,

$$|z^{(i)}(t)| \le C(1 + \|u_i\|_{\ell})$$

as required.

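To prove the second part of the lemma choose $\ell \in (0, 2 + \gamma)$ and then choose
$s \in [\ell - 1, 1 + \gamma) \cap (1, \ell + 1)$; this requires $\gamma > 0$ to ensure a nonempty
intersection. Then

$$\|v^{(i)}(t)\|_{1+s} \le C\Bigl(\frac{1}{t^{(1+s-\ell)/2}}\|u_i\|_{\ell} + \|\psi\|_{C([0,T];\mathcal{H}^{\gamma})}\Bigr). \qquad (4.11)$$

Now we have

$$\begin{aligned}
|z^{(1)}(t) - z^{(2)}(t)| &\le |z^{(1)}(0) - z^{(2)}(0)| + \int_0^t \bigl|v^{(1)}(z^{(1)}(\tau), \tau) - v^{(2)}(z^{(2)}(\tau), \tau)\bigr|\,d\tau \\
&\le \int_0^t \|Dv^{(1)}(\cdot, \tau)\|_{L^\infty}\,|z^{(1)}(\tau) - z^{(2)}(\tau)|\,d\tau
 + \int_0^t \|v^{(1)}(\cdot, \tau) - v^{(2)}(\cdot, \tau)\|_{L^\infty}\,d\tau \\
&\le C\int_0^t \|v^{(1)}(\cdot, \tau)\|_{1+s}\,|z^{(1)}(\tau) - z^{(2)}(\tau)|\,d\tau
 + C\int_0^t \|v^{(1)}(\cdot, \tau) - v^{(2)}(\cdot, \tau)\|_s\,d\tau \\
&\le C\int_0^t \Bigl(\frac{1}{\tau^{(1+s-\ell)/2}}\|u_1\|_{\ell} + \|\psi\|_{C([0,T];\mathcal{H}^{\gamma})}\Bigr)|z^{(1)}(\tau) - z^{(2)}(\tau)|\,d\tau
 + C\int_0^t \frac{1}{\tau^{(s-\ell)/2}}\|u_1 - u_2\|_{\ell}\,d\tau.
\end{aligned}$$

Both time singularities are integrable and application of the Gronwall inequality
from Lemma A.1 gives, for some $C$ depending on $\|u_1\|_{\ell}$ and
$\|\psi\|_{C([0,T];\mathcal{H}^{\gamma})}$,

$$\|z^{(1)} - z^{(2)}\|_{L^\infty((0,T);\mathbb{R}^2)} \le C\|u_1 - u_2\|_{\ell}.$$

The desired Lipschitz bound on $\mathcal{G}$ follows. In particular, the desired dependence
of the Lipschitz constant follows from the fact that, for any $\epsilon > 0$ there is $M > 0$
with the property that, for all $\theta \ge 0$,

$$1 + \theta\exp(\theta) \le M\exp(\epsilon\theta^2).$$

□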
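One explicit, non-optimal choice of the constant $M$ in the last inequality can be obtained as follows. This short derivation is an added illustration, not part of the original argument; it uses only $\theta \le e^{\theta}$ and completion of the square.

```latex
\begin{align*}
\theta e^{\theta} &\le e^{2\theta}
  && \text{since } \theta \le e^{\theta} \text{ for } \theta \ge 0,\\
2\theta &\le \epsilon\theta^{2} + \epsilon^{-1}
  && \text{since } \bigl(\sqrt{\epsilon}\,\theta - 1/\sqrt{\epsilon}\bigr)^{2} \ge 0,\\
1 + \theta e^{\theta} &\le e^{\epsilon\theta^{2}} + e^{1/\epsilon}e^{\epsilon\theta^{2}}
  = \bigl(1 + e^{1/\epsilon}\bigr)e^{\epsilon\theta^{2}}.
\end{align*}
```

Hence $M = 1 + e^{1/\epsilon}$ suffices, at the price of a constant that blows up as $\epsilon \to 0$.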

We conclude this section with the results of numerical experiments illustrating
the theory. We compute the posterior distribution on the initial condition
for Stokes equations from observation of $J$ Lagrangian trajectories at one time
$t = 0.1$. The prior measure is taken to be $\mathcal{N}(0, 400 \times A^{-2})$. The initial condition
used to generate the data is found by making a single draw from the prior measure
and the observational noise on the Lagrangian data is i.i.d. $\mathcal{N}(0, \gamma^2)$ with $\gamma = 0.01$.

Note that, in the periodic geometry assumed here, the Stokes equations can be 
solved exactly by Fourier analysis GSll . Thus there are four sources of approxi- 
mation when attempting to sample the posterior measure on u. These are 

• (i) the effect of generating approximate samples from the posterior measure 
by use of MCMC methods; 

• (ii) the effect of approximating $\mu$ in a finite space found by orthogonal
projection on the eigenbasis of the Stokes operator;

• (iii) the effect of interpolating a velocity field on a grid, found from use of 
the FFT, into values at the arbitrary locations of Lagrangian tracers; 

• (iv) the effect of time-step in an Euler integration of the Lagrangian trajec- 
tory equations. 

The MCMC method that we use is a generalization of the random walk Metropolis
method and is detailed in [4]. The method is appropriate for sampling measures
absolutely continuous with respect to a Gaussian in the situation where it is
straightforward to sample directly from the Gaussian itself. We control the error
(i) simply by running the MCMC method until time averages of various test
statistics have converged; the reader interested in the effect of this Monte Carlo
error should consult [4]. The error in (ii) is precisely the error which we quantify in
Theorem 4.4; for the particular experiments used here we predict an error of order
$N^{-q}$ for any $q \in (0, 1)$. In this paper we have not analyzed the errors resulting
from (iii) and (iv): these approximations are not included in the analysis leading
to Theorem 4.4. However we anticipate that Theorem 2.4 or Theorem 2.6 could
be used to study such approximations and the numerical evidence which follows
below is consistent with this conjecture.
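To make concrete what a random-walk-type sampler for a measure with density $\exp(-\Phi)$ against a Gaussian can look like, here is a minimal sketch of the preconditioned Crank-Nicolson (pCN) proposal. Whether this coincides with the method of [4] is an assumption made here for illustration; the sketch also assumes a truncated basis in which the prior is diagonal with standard deviations `prior_std`, and all names are hypothetical.

```python
import math
import random

def pcn_chain(phi, prior_std, n_steps, beta=0.2, seed=0):
    """Sample a measure with density exp(-phi) w.r.t. N(0, diag(prior_std^2)).

    Proposal: v = sqrt(1 - beta^2) * u + beta * xi with xi drawn from the prior.
    Acceptance probability: min(1, exp(phi(u) - phi(v))).
    Returns the list of states and the empirical acceptance rate.
    """
    rng = random.Random(seed)
    u = [rng.gauss(0.0, s) for s in prior_std]   # start from a prior draw
    phi_u = phi(u)
    samples, accepted = [], 0
    for _ in range(n_steps):
        xi = [rng.gauss(0.0, s) for s in prior_std]
        v = [math.sqrt(1.0 - beta ** 2) * ui + beta * x for ui, x in zip(u, xi)]
        phi_v = phi(v)
        # 1 - random() lies in (0, 1], so the logarithm is well defined
        if math.log(1.0 - rng.random()) <= phi_u - phi_v:
            u, phi_u = v, phi_v
            accepted += 1
        samples.append(list(u))
    return samples, accepted / n_steps
```

The proposal only requires draws from the prior, and the acceptance rule only requires evaluations of the misfit, which is the situation described in the paragraph above. With `phi` identically zero every proposal is accepted and the chain samples the prior.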
In the following three numerical experiments (each illustrated by a figure) we
study the effect of one or more of the approximations (ii), (iii) and (iv) on the
empirical distribution ('histogram') found from marginalizing data from the MCMC
method onto the real part of the Fourier mode with wavevector $k = (0, 1)$. Similar
results are found for other Fourier modes although it is important to note that at
high values of $|k|$ the data is uninformative and the posterior is very close to the
prior (see dUl for details). The first two figures use J = 9 Lagrangian trajectories, 
whilst the third uses $J = 400$. Figure 1 shows the effect of increasing the number
of Fourier modes* used from 16, through 100 and 196, to a total of 400 modes
and illustrates Theorem 4.4 in that convergence to a limit is observed as the
number of Fourier modes increases. However this experiment is conducted by using
bilinear interpolation of the velocity field on the grid, in order to obtain off-grid
velocities required for particle trajectories. At the cost of quadrupling the number
of FFTs it is possible to implement bicubic interpolation†. Conducting the same
refinement of the number of Fourier modes then yields Figure 2. Comparison of
Figures 1 and 2 shows that the approximation (iii) by increased order of
interpolation leads to improved approximation of the posterior distribution, and
Figure 2 alone again illustrates Theorem 4.4. Figure 3 shows the effect (iv) of
reducing the time-step used in the integration of the Lagrangian trajectories. Note
that many more (400) particles were used to generate the observations leading to
this figure than were used in the preceding two figures. This explains the
quantitatively different posterior distribution; in particular the variance in the
posterior distribution is considerably smaller. The result shows clearly that
reducing the time-step leads to convergence in the posterior distribution.

* Here by number of Fourier modes, we mean the dimension of the Fourier space
approximation, i.e. the number of grid points.

† Bicubic interpolation with no added FFTs is also possible by using finite
difference methods to find the partial derivatives, but at a lower order of accuracy.

Figure 1: Marginal distributions on $\mathrm{Re}(u_{0,1}(0))$ with differing numbers of Fourier
modes (16, 100, 196 and 400 Fourier modes).

Figure 2: Marginal distributions on $\mathrm{Re}(u_{0,1}(0))$ with differing numbers of Fourier
modes, bicubic interpolation used.
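The time-step effect (iv) can be illustrated in isolation with a forward Euler integration of a single tracer. The sketch below is not the experiment of this section: it assumes a hypothetical single-mode velocity field $v(x,t) = e^{-4\pi^2\nu t}(\sin 2\pi x_2, \sin 2\pi x_1)$, which is divergence-free, periodic and solves the unforced Stokes equations with zero pressure, together with an illustrative viscosity and starting point.

```python
import math

NU = 0.01  # illustrative viscosity, not a value taken from the paper

def velocity(x, t):
    """Single-mode, divergence-free, periodic solution of the unforced Stokes equations."""
    decay = math.exp(-4.0 * math.pi ** 2 * NU * t)
    return (decay * math.sin(2.0 * math.pi * x[1]),
            decay * math.sin(2.0 * math.pi * x[0]))

def euler_trajectory(z0, t_final, n_steps):
    """Forward Euler integration of dz/dt = v(z, t) from z(0) = z0."""
    dt = t_final / n_steps
    z = [z0[0], z0[1]]
    t = 0.0
    for _ in range(n_steps):
        vx, vy = velocity(z, t)
        z[0] += dt * vx
        z[1] += dt * vy
        t += dt
    return (z[0], z[1])

def trajectory_error(z0, t_final, n_steps, n_ref=4096):
    """Distance between a coarse Euler end point and a fine-step reference end point."""
    za = euler_trajectory(z0, t_final, n_steps)
    zb = euler_trajectory(z0, t_final, n_ref)
    return math.hypot(za[0] - zb[0], za[1] - zb[1])
```

Halving the time-step should roughly halve `trajectory_error`, as expected for a first-order scheme; this perturbs the forward map and hence, through the misfit, the posterior.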

5 Eulerian Data Assimilation 

In this section we consider a data assimilation problem that is related to weather 
forecasting applications. In this problem, direct observations are made of the 
velocity field of an incompressible viscous flow at some fixed points in space- 
time, the mathematical model is the two-dimensional Navier-Stokes equations on 
a torus, and the objective is to obtain an estimate of the initial velocity field. The 
spaces $\mathcal{H}$ and $\mathcal{H}^s$ are as defined in Section 4, with $\|\cdot\|_s$ the norm in
$\mathcal{H}^s$ and $\|\cdot\| = \|\cdot\|_0$. The definitions of $A$, the Stokes operator, and $P$,
the Leray projector, are also as in the previous section [27, 28].

Figure 3: Marginal distributions on $\mathrm{Re}(u_{0,1}(0))$ with differing time-step,
Lagrangian data.

We consider the incompressible two-dimensional Navier-Stokes equations

$$\begin{aligned}
\frac{\partial v}{\partial t} &= \nu\Delta v - (v\cdot\nabla)v - \nabla p + f, && (x, t) \in D \times [0, \infty),\\
\nabla\cdot v &= 0, && (x, t) \in D \times [0, \infty),\\
v &= u, && (x, t) \in D \times \{0\},
\end{aligned}$$

where $D$ is a unit square as before and the boundary conditions are periodic.
We apply the Leray projector $P : L^2_{\mathrm{per}}(D) \to \mathcal{H}$ and write the Navier-Stokes
equations as an ordinary differential equation in $\mathcal{H}$:

$$\frac{dv}{dt} + \nu Av + B(v, v) = \psi, \qquad v(0) = u \qquad (5.1)$$

with $A$ the Stokes operator, $B(v, v) = P((v\cdot\nabla)v)$ and $\psi = Pf$.

For simplicity we assume that we make noisy observations of the velocity field
$v$ at time $t > 0$ and at points $x_1, \ldots, x_K \in D$:

$$y_k = v(x_k, t) + \eta_k, \qquad k = 1, \ldots, K.$$

We assume that the noise is Gaussian and the $\eta_k$ form an i.i.d. sequence with
$\eta_1 \sim \mathcal{N}(0, \gamma^2)$. It is known (see Chapter 3 of [27], for example) that for
$u \in \mathcal{H}$ and $f \in L^2(0, T; \mathcal{H}^s)$ with $s > 0$ a unique solution to (5.1) exists which
satisfies $v \in L^\infty(0, T; \mathcal{H}^{1+s}) \subset L^\infty(0, T; L^\infty(D))$. Therefore for such initial
condition and forcing function the value of $v$ at any $x \in D$ can be written as a
function of $u$. Hence, we can write

$$y = \mathcal{G}(u) + \eta$$

where $y = (y_1, \ldots, y_K)^T \in \mathbb{R}^{2K}$ and $\eta = (\eta_1, \ldots, \eta_K)^T \in \mathbb{R}^{2K}$ is
distributed as $\mathcal{N}(0, \gamma^2 I)$ and

$$\mathcal{G}(u) = \bigl(v(x_1, t), \ldots, v(x_K, t)\bigr)^T. \qquad (5.2)$$

Now consider a Gaussian prior measure $\mu_0 \sim \mathcal{N}(u_b, \beta A^{-\alpha})$ with $\beta > 0$ and
$\alpha > 1$; recall that the second condition ensures that functions drawn from the
prior are in $\mathcal{H}$, by Lemma A.3. In Theorem 3.4 of [5] it is shown that with such
prior measure, the posterior measure of the above inverse problem is well-defined:



Theorem 5.1 Assume that $f \in L^2(0, T; \mathcal{H}^s)$ with $s > 0$. Consider the Eulerian
data assimilation problem described above. Define a Gaussian measure $\mu_0$ on
$\mathcal{H}$, with mean $u_b$ and covariance operator $\beta A^{-\alpha}$ for any $\beta > 0$ and $\alpha > 1$. If
$u_b \in \mathcal{H}^{\alpha}$, then the probability measure $\mu(du) = \mathbb{P}(du|y)$ is absolutely continuous
with respect to $\mu_0$ with Radon-Nikodym derivative

$$\frac{d\mu}{d\mu_0}(u) \propto \exp\Bigl(-\frac{1}{2\gamma^2}\bigl|y - \mathcal{G}(u)\bigr|^2\Bigr). \qquad (5.3)$$

We now define an approximation $\mu^N$ to $\mu$ given by (5.3). The approximation
is made by employing the Galerkin approximations of $v$ to define an approximate
$\mathcal{G}$. The Galerkin approximation of $v$, $v^N$, is the solution of

$$\frac{dv^N}{dt} + \nu Av^N + P^N B(v^N, v^N) = P^N\psi, \qquad v^N(0) = P^N u \qquad (5.4)$$

with $P^N$ as defined in the previous section. Let

$$\mathcal{G}^N(u) = \bigl(v^N(x_1, t), \ldots, v^N(x_K, t)\bigr)^T$$

and then consider the approximate posterior measure $\mu^N$ defined via its
Radon-Nikodym derivative with respect to $\mu_0$:

$$\frac{d\mu^N}{d\mu_0}(u) \propto \exp\Bigl(-\frac{1}{2\gamma^2}\bigl|y - \mathcal{G}^N(u)\bigr|^2\Bigr). \qquad (5.5)$$

Our aim is to show that $\mu^N$ converges to $\mu$ in the Hellinger metric. Unlike the
examples in the previous two sections we are unable to obtain sufficient control
on the dependence of the error constant on $u$ in the forward error bound to enable
application of Theorem 2.4; hence we employ Theorem 2.6. In the following
lemma we obtain a bound on $\|v(t) - v^N(t)\|_{L^\infty(D)}$ and therefore on
$|\mathcal{G}(u) - \mathcal{G}^N(u)|$. Following the statement of the lemma, we state and prove the
basic approximation theorem for this section. The proof of the lemma is given after
the statement and proof of the approximation theorem for the posterior probability
measure.

Lemma 5.2 Let $v^N$ be the solution of the Galerkin system (5.4). For any $t > 0$

$$\|v(t) - v^N(t)\|_{L^\infty(D)} \le C(\|u\|, t)\,\psi(N)$$

where $\psi(N) \to 0$ as $N \to \infty$.

The above lemma leads us to the following convergence result for $\mu^N$:

Theorem 5.3 Let $\mu^N$ be defined according to (5.5) and let the assumptions of
Theorem 5.1 hold. Then

$$d_{\mathrm{Hell}}(\mu, \mu^N) \to 0$$

as $N \to \infty$.

Proof. We apply Theorem 2.6 with $X = \mathcal{H}$. Assumption 2.2 (and hence
Assumption 2.1) is established in Lemma 3.1 of [5]. By Lemma 5.2,

$$|\mathcal{G}(u) - \mathcal{G}^N(u)| \le K\psi(N)$$

with $K = K(\|u\|)$ and $\psi(N) \to 0$ as $N \to \infty$. Therefore the result follows by
Theorem 2.6. □

Proof of Lemma 5.2. Let $e_1 = v - P^N v$ and $e_2 = P^N v - v^N$. Applying $P^N$ to
(5.1) yields

$$\frac{dP^N v}{dt} + \nu AP^N v + P^N B(v, v) = P^N\psi.$$

Therefore $e_2 = P^N v - v^N$ satisfies

$$\frac{de_2}{dt} + \nu Ae_2 = P^N B(e_1 + e_2, v) + P^N B(v^N, e_1 + e_2), \qquad e_2(0) = 0. \qquad (5.6)$$

Since for any $l \ge 0$ and for $m > l$

$$\|e_1\|_l^2 \le \frac{1}{N^{2(m-l)}}\|v\|_m^2 \qquad (5.7)$$

we will obtain an upper bound for $\|e_2\|_{1+l}$, $l > 0$, in terms of the Sobolev norms
of $e_1$ and then use the embedding $\mathcal{H}^{1+l} \subset L^\infty$ to conclude the result of the lemma.
Taking the inner product of (5.6) with $e_2$, and noting that $P^N$ is self-adjoint
and $P^N e_2 = e_2$ and $(B(v, w), w) = 0$, we obtain

~\\e2\\^ + ^Peaf = (B{ei + 62, v\ 62) + ei), 62) 

^ II l|l/2|| ||1/2| ||1/2|| ||l/2 , II II II II II II 

^c||ei||' ||ei||/ ||'y||i||e2|| ' ||e2||i + c||e2|| ||e2||i 

I II Af||l/2|| Af||l/2|| II II ||1/2|| ||l/2 

+ c\\v II ' \\v 11/ ||ei||i||e2|| ' ||e2||i 

^ II Il2 II ||2 I II ||2 II II I II I|2 II I|2 

^ c||ei|| ||ei||i + c||f |U|e2|| + c||e2|| \\v\\^ 

N\\ \\„.N\\ II „ II I „|| „ II II „ II I II „ ||2 



II F 111 ll^illi + c||ei||i ||e2|| + -||e2||i 



Therefore 



A(l + ||e2f) + z/||De2f ^ 
at 

c(l + ||'j;||2)(l + ||e2f) + c(l+ ||eif)||ei||? + c||t;^|| ||t;^||i ||ei||i 

which gives 

||e2(t)ir + y \\De2\\' ^ cm {l + £ ||t;^f ||^^^||?dr^ j'^ ||ei||?dr 

+ cm [ (l + ||eif)||ei||?dr. 



with ^ 

p(t) = exp(^c 1 + IK'llidr ) . 

Hence 



||e2(t)|P + z/ r||De2|p^c(l + ||wr)e=+*ll' Al + ||ei|p) ||ei||? dr. (5.8) 
Jo Jo 



To estimate $\|e_2(t)\|_s$ for $s < 1$, we take the inner product of (5.6) with $A^s e_2$,
$0 < s < 1$, and write

$$\frac{1}{2}\frac{d}{dt}\|e_2\|_s^2 + \nu\|e_2\|_{1+s}^2 \le \bigl|\bigl(((e_1 + e_2)\cdot\nabla)v, A^s e_2\bigr)\bigr| + \bigl|\bigl((v^N\cdot\nabla)(e_1 + e_2), A^s e_2\bigr)\bigr|.$$

Using

$$\bigl|\bigl((u\cdot\nabla)v, A^s w\bigr)\bigr| \le c\|u\|_s\|v\|_1\|w\|_{1+s}$$

and Young's inequality we obtain

$$\frac{d}{dt}\|e_2\|_s^2 + \nu\|e_2\|_{1+s}^2 \le c\bigl(\|e_1\|_s^2 + \|e_2\|_s^2\bigr)\|v\|_1^2 + c\|v^N\|_s^2\bigl(\|e_1\|_1^2 + \|e_2\|_1^2\bigr).$$

Now integrating with respect to $t$ over $(t_0, t)$ with $0 < t_0 < t$ we can write

\\e2m' + ^ f\\e2\\l+sdr ^ ||e2(to)||' + c sup ||t;(r)||? T ||eill' + llesH^dr 

Jto T^to Jo 



+ c sup ||t;^(r)||^ / lldll^ + llesll^dr. 



Therefore since for s ^ 1 and t ^ to 



I ,.M|2 ^ ll^f) 



and noting that the same kind of decay bounds that hold for $v$ can be shown
similarly for $v^N$ as well, we have

I|e2(t)||^+^ ri|e2||?+.dr ^ ||e2(to)||^+^(l + ||n|| V+*"' Al+Heif ) ||ei||?dr. 

Jto '^0 Jo 

Integrating the above inequality with respect to to in (0, t) we obtain 

I|e2(t)||' + ^^ / ||e2||?+,dr^f(to + l+||nf) / (1 + ||eif ) ||ei||?dr (5.9) 

Jto '-O Jo 



for t > to. 

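Now we estimate $\|e_2(t)\|_s$ for $s > 1$. Taking the inner product of (5.6) with
$A^{1+l}e_2$, $0 < l < 1$, we obtain

$$\frac{1}{2}\frac{d}{dt}\|e_2\|_{1+l}^2 + \nu\|e_2\|_{2+l}^2 \le \bigl|\bigl(((e_1 + e_2)\cdot\nabla)v, A^{1+l}e_2\bigr)\bigr| + \bigl|\bigl((v^N\cdot\nabla)(e_1 + e_2), A^{1+l}e_2\bigr)\bigr|.$$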

Since (see fll) 

{{U ■ V)V, A^^^w) ^ C \\u\\i+l \\v\\i \\w\\2+l + C \\u\\i \\v\\2 \\w\\2+l 

and using Young's inequality, we can write 

^I|e2||?+i + t^\\e2\\l+i ^ c \\ei\\l^,^\\v\\l + c ||ei||f ||f ||^ 

+ c||e2||i+,||^^||i + c||e2||f||w||2 
+ c|ni?+,||ei||? + cl|^;^l|fl|ei||^ 

I II Af||2 II ||2 I II N\\'i/l\\ ||2 
+ \\l+l\\e2\\l + c\\v 11/ ||e2||i+;. 
Now we integrate the above inequality with respect to $t$ over $(t_0/2 + a, t)$ with
< to < ^ and < a < t — to/2 and obtain (noting that ||f^||s ^ ||^^||s for any 
s > 0) 

|2 ^ ||„ /J. /o I „M|2 I ll„,/„M|2 / 11 ||2 I 11 ||2 



||e2(t)||t+i ^ \\e2(to/2 + a)\\i^i+ sup ||t;(r)||f / ||ei||f+, + ||e2||f+, dr 

Jto/2+a 



+ sup (||ei(r)||2 + ||e2(r)||2) / \\v\\ldr 

Jto/2+a 



+ sup (||ei(r)||? + ||e2(r)||2) / ||t;^||?+,dr 

+ sup (1 + ) / lleill^ + ||e2||?+^dr. 

We have, for s > 1 and t > to, ([5]) 



c{l + \\uf) 

t^ 



''0 

Therefore using (15.91) and (15.71) we conclude that 
Ik2(i)li;+, < ||e2(io/2 + »)||;+, 

with r > 1 and where Cp(||M||) is a constant depending on polynomials of 
Integrating the above inequality with respect to a over (0, t — to/2) we obtain 

||e2(t)||?+, ^ C.dl^ll) + -1^) Al + lleif ) ||ei||?dr 

ytg tg / Jo 



jY2(m-i)^2+m jY2(r-l) ^2+r 



Now to show that $\|e_1\|^2 + \int_0^t \|e_1\|_1^2\,d\tau \to 0$ as $N \to \infty$, we note that $e_1$ satisfies

\l\\eir + HDe4 ^ ||(/ - P^)/|| lleill + \mv,v\e,)\\ 

^ ||(/ - P^)/|| lleill + \\vf" WDvfl^ l|ei||i/2 UDeil^/^ 
^ W - P"')/!! Ileill + c ll^lp/^ lleilp + "^WDe.r. 

Therefore

$$\frac{d}{dt}\|e_1\|^2 + \nu\|De_1\|^2 \le c\|(I - P^N)f\|^2 + c\bigl(1 + \|Dv\|^2\bigr)\|e_1\|^2$$

and after integrating, we get

$$\|e_1\|^2 + \nu\int_0^t \|e_1\|_1^2\,d\tau \le \exp\bigl(1 + C_p(\|u\|)\bigr)\Bigl(\|e_1(0)\|^2 + \int_0^t \|(I - P^N)f\|^2\,d\tau\Bigr).$$

Since $f \in L^2(0, T; \mathcal{H})$, the above integral tends to zero as $N \to \infty$ and the result
follows. □

6 Conclusions 

In this paper we have studied the approximation of inverse problems which have 
been regularized by means of a Bayesian formulation. We have developed a gen- 
eral approximation theory which allows for the transfer of approximation results 
for the forward problem into approximation results for the inverse problem. The 
theory clearly separates analysis of the forward problem, in which no probabilistic 
methods are required, and the probabilistic framework for the inverse problem it- 
self: it is simply necessary that the requisite bounds and approximation properties 
for the forward problem hold in a space with full measure under the prior. Indeed 
the approximation theory may be seen to place constraints on the prior, in order 
to ensure the desired robustness. 

In applications there are two sources of error when calculating expectations 
of functions of infinite dimensional random variables: the error which we provide 
an analysis for in this paper, namely the approximation of the measure itself in 
a finite dimensional subspace, together with the error incurred through calcula- 
tion of expectations. The latter can be undertaken by Markov chain-Monte Carlo 
(MCMC) methods, or quasi Monte Carlo methods. The two sources of error must 
be balanced in order to optimize computational cost. 

We have studied three specific applications, all concerned with determining 
the initial condition of a dissipative PDE, from observations of various kinds, at 
positive times. However the general approach is applicable to a range of inverse 
problems for functions when formulated in a Bayesian fashion. The article [25]
overviews many applications from this point of view. Furthermore we have lim- 
ited our approximation of the underlying forward problem to spectral methods. 
However we anticipate that the general approach will be useful for the analysis 
of other spatial approximations based on finite element methods, for example, 
and to approximation errors resulting from time-discretization; indeed it would be 
interesting to carry out analyses for such approximations. 

It is important to realize that new approaches to the computation of expectations
against measures on infinite dimensional spaces are currently an active area
of research in the engineering community [23, 24] and that a numerical analysis of
this area is being systematically developed [22, 29]. That work is primarily
concerned with approximating measures which are the push forward, under a
nonlinear map, of a simple measure with product structure, such as a Gaussian
measure; in contrast the inverse problem setting which we study here is concerned
with the approximation of non-Gaussian measures whose Radon-Nikodym derivative
is defined through a related nonlinear map. It would be interesting to combine the
approaches in [22, 23, 29] and related literature with the approximation theories
described in this paper. For example that work could be used to develop cheap
approximations to the forward map $\mathcal{G}$, thereby accelerating MCMC-based sampling
methods.
Appendix A Analytic Semigroups and Probability

We collect together some basic facts concerning analytic semigroups and
probability required in the main body of the article. First we state the well-known
Gronwall inequality in the form in which we will use it*.

Lemma A.1 Let $I = [c, d)$ with $d \in (c, \infty]$. Assume that $\alpha, u \in C(I; \mathbb{R}^+)$ and
that there is $\lambda < \infty$ such that, for all intervals $J \subseteq I$, $\int_J \beta(s)\,ds < \lambda$. If

$$u(t) \le \alpha(t) + \int_c^t \beta(s)u(s)\,ds, \qquad t \in I,$$

then

$$u(t) \le \alpha(t) + \int_c^t \alpha(s)\beta(s)\exp\Bigl(\int_s^t \beta(r)\,dr\Bigr)ds, \qquad t \in I.$$

In particular, if $\alpha(t) = u + 2at$ is positive in $I$ and $\beta(t) = 2b$ then

$$u(t) \le \exp(2bt)u + \frac{a}{b}\bigl(\exp(2bt) - 1\bigr), \qquad t \in I.$$

Finally, if $c = 0$, and $0 < \alpha(t) \le K$ in $I$, then

$$u(t) \le K + K\lambda\exp(\lambda), \qquad t \in I.$$
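The second bound in Lemma A.1 follows from the first by direct integration. The short calculation below is an added check, assuming $c = 0$ and $b \neq 0$; it substitutes $\alpha(s) = u + 2as$ and $\beta = 2b$ into the general bound and integrates by parts.

```latex
\begin{align*}
u(t) &\le u + 2at + \int_0^t (u + 2as)\,2b\,e^{2b(t-s)}\,ds,\\
\int_0^t 2b\,e^{2b(t-s)}\,ds &= e^{2bt} - 1,\\
\int_0^t 2as\,2b\,e^{2b(t-s)}\,ds &= -2at + \frac{a}{b}\bigl(e^{2bt} - 1\bigr),\\
u(t) &\le e^{2bt}u + \frac{a}{b}\bigl(e^{2bt} - 1\bigr).
\end{align*}
```

The terms $2at$ and $-2at$ cancel, which is why no term linear in $t$ appears in the final bound.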

Throughout this article $A$ denotes either the Laplacian on a smooth, bounded
domain in $\mathbb{R}^d$ with Dirichlet boundary conditions (section 3) or the Stokes
operator on $\mathbb{T}^2$ (sections 4 and 5). In both cases $A$ is a self-adjoint positive
operator, densely defined on a Hilbert space $\mathcal{H}$, and the generator of an analytic
semigroup. We denote by $\{(\phi_k, \lambda_k)\}_{k \in \mathbb{K}}$ a complete orthonormal set of
eigenfunctions/eigenvalues for $A$ in $\mathcal{H}$. We then define fractional powers of $A$ by

$$A^{\alpha}u = \sum_{k \in \mathbb{K}} \lambda_k^{\alpha}(u, \phi_k)\phi_k. \qquad (A.1)$$

For any $s \in \mathbb{R}$ we define the Hilbert spaces $\mathcal{H}^s$ by

$$\mathcal{H}^s = \Bigl\{u : \sum_{k \in \mathbb{K}} \lambda_k^s|(u, \phi_k)|^2 < \infty\Bigr\}. \qquad (A.2)$$

The norm in $\mathcal{H}^s$ is denoted by $\|\cdot\|_s$ and is given by

$$\|u\|_s^2 = \sum_{k \in \mathbb{K}} \lambda_k^s|(u, \phi_k)|^2.$$

Of course $\mathcal{H}^0 = \mathcal{H}$. If $s > 0$ then these spaces are contained in $\mathcal{H}$, but for $s < 0$
they are larger than $\mathcal{H}$. It follows that the domain of $A^{\alpha}$ is $\mathcal{H}^{2\alpha}$; the image of
$A^{-\alpha}$ is $\mathcal{H}^{2\alpha}$.

* See http://en.wikipedia.org/wiki/Gronwall's_inequality

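Now consider the Hilbert-space valued ODE

$$\frac{dv}{dt} + Av = f, \qquad v(0) = u. \qquad (A.3)$$

We state some basic results in this area, provable by use of the techniques in [20],
for example, or by direct calculation using the eigenbasis for $A$. For $f = 0$ the
solution $v \in C([0, \infty), \mathcal{H}) \cap C^1((0, \infty), D(A))$ and

$$\|v(t)\|_s^2 \le Ct^{-(s-\ell)}\|u\|_{\ell}^2 \qquad \forall t \in (0, T]. \qquad (A.4)$$

If $f \in C([0, T], \mathcal{H}^{\gamma})$ for some $\gamma \ge 0$, then (A.3) has a unique mild solution
$v \in C([0, T]; \mathcal{H})$ and, for $0 \le \ell < \gamma + 2$,

$$\|v(t)\|_s \le C\Bigl(\frac{\|u\|_{\ell}}{t^{(s-\ell)/2}} + \|f\|_{C([0,T],\mathcal{H}^{\gamma})}\Bigr) \qquad (A.5)$$

for $s \in [\ell, 2 + \gamma)$.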

It is central to this paper to estimate the distance between two probability
measures. To this end we introduce two useful metrics on measures: the total
variation distance and the Hellinger distance. We discuss the relationships between
the metrics and indicate how they may be used to estimate differences between
expectations of random variables under two different measures.

Assume that we have two probability measures $\mu$ and $\mu'$, both absolutely
continuous with respect to the same reference measure $\nu$. The following defines two
concepts of distance between $\mu$ and $\mu'$.

Definition A.2 The total variation distance between $\mu$ and $\mu'$ is

$$d_{\mathrm{TV}}(\mu, \mu') = \frac{1}{2}\int\Bigl|\frac{d\mu}{d\nu} - \frac{d\mu'}{d\nu}\Bigr|\,d\nu.$$

The Hellinger distance between $\mu$ and $\mu'$ is

$$d_{\mathrm{Hell}}(\mu, \mu') = \sqrt{\frac{1}{2}\int\Bigl(\sqrt{\frac{d\mu}{d\nu}} - \sqrt{\frac{d\mu'}{d\nu}}\Bigr)^2 d\nu}.$$

Both distances are invariant under the choice of $\nu$ in that they are unchanged if
a different reference measure, with respect to which $\mu$ and $\mu'$ are absolutely
continuous, is used. Furthermore, it follows from the definitions that
$d_{\mathrm{TV}}(\mu, \mu') \in [0, 1]$ and $d_{\mathrm{Hell}}(\mu, \mu') \in [0, 1]$. The Hellinger and total
variation distances are related as follows†:

$$\frac{1}{\sqrt{2}}d_{\mathrm{TV}}(\mu, \mu') \le d_{\mathrm{Hell}}(\mu, \mu') \le d_{\mathrm{TV}}(\mu, \mu')^{1/2}. \qquad (A.6)$$
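For measures on a finite set, with counting measure as the reference measure, both distances reduce to finite sums and the relation (A.6) can be checked directly. The sketch below assumes the normalization in which both distances take values in $[0, 1]$; the function names are illustrative.

```python
import math

def total_variation(p, q):
    """d_TV = 0.5 * sum_i |p_i - q_i| for probability vectors p and q."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

def hellinger(p, q):
    """d_Hell = sqrt(0.5 * sum_i (sqrt(p_i) - sqrt(q_i))^2)."""
    return math.sqrt(0.5 * sum((math.sqrt(pi) - math.sqrt(qi)) ** 2
                               for pi, qi in zip(p, q)))

def satisfies_a6(p, q, tol=1e-12):
    """Check (1/sqrt(2)) d_TV <= d_Hell <= sqrt(d_TV), up to rounding."""
    tv, he = total_variation(p, q), hellinger(p, q)
    return tv / math.sqrt(2.0) <= he + tol and he <= math.sqrt(tv) + tol
```

For example, with `p = [0.5, 0.3, 0.2]` and `q = [0.2, 0.2, 0.6]` the total variation distance is 0.4, and the Hellinger distance lies between $0.4/\sqrt{2}$ and $\sqrt{0.4}$.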

The Hellinger distance is particularly useful for estimating the difference be- 
tween expectation values of functions of random variables under different mea- 
sures. This is illustrated in the following lemma: 



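Lemma A.3 Assume that two measures $\mu$ and $\mu'$ on a Banach space $(X, \|\cdot\|_X)$
are both absolutely continuous with respect to a measure $\nu$. Assume also that
$f : X \to Z$, where $(Z, \|\cdot\|)$ is a Banach space, has second moments with
respect to both $\mu$ and $\mu'$. Then

$$\|\mathbb{E}^{\mu}f - \mathbb{E}^{\mu'}f\| \le 2\bigl(\mathbb{E}^{\mu}\|f\|^2 + \mathbb{E}^{\mu'}\|f\|^2\bigr)^{1/2}d_{\mathrm{Hell}}(\mu, \mu').$$

Furthermore, if $(Z, \langle\cdot,\cdot\rangle)$ is a Hilbert space and $f : X \to Z$ has fourth moments
then

$$\|\mathbb{E}^{\mu}f \otimes f - \mathbb{E}^{\mu'}f \otimes f\| \le 2\bigl(\mathbb{E}^{\mu}\|f\|^4 + \mathbb{E}^{\mu'}\|f\|^4\bigr)^{1/2}d_{\mathrm{Hell}}(\mu, \mu').$$

Proof. We have

$$\begin{aligned}
\|\mathbb{E}^{\mu}f - \mathbb{E}^{\mu'}f\| &\le \int \|f\|\,\Bigl|\frac{d\mu}{d\nu} - \frac{d\mu'}{d\nu}\Bigr|\,d\nu \\
&= \int \Bigl(\frac{1}{\sqrt{2}}\Bigl|\sqrt{\frac{d\mu}{d\nu}} - \sqrt{\frac{d\mu'}{d\nu}}\Bigr|\Bigr)\Bigl(\sqrt{2}\,\|f\|\,\Bigl|\sqrt{\frac{d\mu}{d\nu}} + \sqrt{\frac{d\mu'}{d\nu}}\Bigr|\Bigr)d\nu \\
&\le \sqrt{\frac{1}{2}\int\Bigl(\sqrt{\frac{d\mu}{d\nu}} - \sqrt{\frac{d\mu'}{d\nu}}\Bigr)^2 d\nu}\;\sqrt{2\int \|f\|^2\Bigl(\sqrt{\frac{d\mu}{d\nu}} + \sqrt{\frac{d\mu'}{d\nu}}\Bigr)^2 d\nu} \\
&\le \sqrt{\frac{1}{2}\int\Bigl(\sqrt{\frac{d\mu}{d\nu}} - \sqrt{\frac{d\mu'}{d\nu}}\Bigr)^2 d\nu}\;\sqrt{4\int \|f\|^2\Bigl(\frac{d\mu}{d\nu} + \frac{d\mu'}{d\nu}\Bigr)d\nu} \\
&= 2\bigl(\mathbb{E}^{\mu}\|f\|^2 + \mathbb{E}^{\mu'}\|f\|^2\bigr)^{1/2}d_{\mathrm{Hell}}(\mu, \mu')
\end{aligned}$$

as required.

The proof for $f \otimes f$ follows from the following inequalities, and then arguing
similarly to the case for the norm of $f$:

$$\|\mathbb{E}^{\mu}f \otimes f - \mathbb{E}^{\mu'}f \otimes f\| = \sup_{\|h\|=1}\|\mathbb{E}^{\mu}\langle f, h\rangle f - \mathbb{E}^{\mu'}\langle f, h\rangle f\|
\le \int \|f\|^2\,\Bigl|\frac{d\mu}{d\nu} - \frac{d\mu'}{d\nu}\Bigr|\,d\nu.$$

□

† Note that different normalization constants are sometimes used in the definitions of distance.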

Note, in particular, that choosing X = Z, and with / chosen to be the identity 
mapping, we deduce that the differences in mean and covariance operators un- 
der two measures are bounded above by the Hellinger distance between the two 
measures. 
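The first bound of Lemma A.3 can be illustrated for a real-valued function on a finite set, again with counting measure as reference. This is a sketch under those simplifying assumptions, with illustrative names, and it repeats the Hellinger formula so that the block is self-contained.

```python
import math

def hellinger(p, q):
    """d_Hell = sqrt(0.5 * sum_i (sqrt(p_i) - sqrt(q_i))^2)."""
    return math.sqrt(0.5 * sum((math.sqrt(pi) - math.sqrt(qi)) ** 2
                               for pi, qi in zip(p, q)))

def expectation(p, f):
    """E^p f = sum_i p_i f_i for a probability vector p and function values f."""
    return sum(pi * fi for pi, fi in zip(p, f))

def mean_gap_and_bound(p, q, f):
    """Return (|E^p f - E^q f|, 2 (E^p f^2 + E^q f^2)^(1/2) d_Hell(p, q))."""
    gap = abs(expectation(p, f) - expectation(q, f))
    f_sq = [x * x for x in f]
    second_moments = expectation(p, f_sq) + expectation(q, f_sq)
    return gap, 2.0 * math.sqrt(second_moments) * hellinger(p, q)
```

The first returned value should never exceed the second; the gap between them indicates how conservative the Hellinger bound is for a given pair of measures and test function.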

The following Fernique Theorem (see [21], Theorem 2.6) will be used repeatedly:

Theorem A.4 Let $x \sim \mu = \mathcal{N}(0, C)$ where $\mu$ is a Gaussian measure on Hilbert
space $\mathcal{H}$. Assume that $\mu(X) = 1$ for some Banach space $(X, \|\cdot\|_X)$ with
$X \subseteq \mathcal{H}$. Then there exists $\alpha > 0$ such that

$$\int_{\mathcal{H}} \exp\bigl(\alpha\|x\|_X^2\bigr)\,\mu(dx) < \infty.$$

The following regularity properties of Gaussian random fields will be useful
to us; the results may be proved by use of the Kolmogorov continuity criterion,
together with the Karhunen-Loeve expansion (see [21], section 3.2):

Lemma A.5 Consider a Gaussian measure $\mu = \mathcal{N}(0, C)$ with $C = \beta A^{-\alpha}$ where
$A$ is as defined earlier in this Appendix A. Then $u \sim \mu$ is almost surely $s$-Hölder
continuous for any exponent $s < \min\{1, \alpha - \frac{d}{2}\}$ and $u \in \mathcal{H}^s$, $\mu$-almost surely,
for any $s < \alpha - \frac{d}{2}$.
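The Sobolev regularity threshold can be seen from the Karhunen-Loeve expansion: formally, the expected squared $\mathcal{H}^s$ norm of a draw is $\beta\sum_k \lambda_k^{s-\alpha}$. The sketch below computes partial sums of this series in two dimensions, where the threshold reads $s < \alpha - 1$ as used in Section 4. It assumes eigenvalues $\lambda_k = 4\pi^2|k|^2$ with one eigenfunction per wavevector $k \in \mathbb{Z}^2\setminus\{0\}$; the function name and the square truncation are illustrative choices.

```python
import math

def expected_norm_sq(s, alpha, beta, k_max):
    """Partial sum beta * sum_{0 < |k|_inf <= k_max} lambda_k^(s - alpha).

    Assumes lambda_k = 4 pi^2 |k|^2 on the two-dimensional torus.
    """
    total = 0.0
    for k1 in range(-k_max, k_max + 1):
        for k2 in range(-k_max, k_max + 1):
            if (k1, k2) != (0, 0):
                lam = 4.0 * math.pi ** 2 * (k1 * k1 + k2 * k2)
                total += lam ** (s - alpha)
    return beta * total

def increment_ratio(s, alpha, beta=1.0):
    """Ratio of successive dyadic increments of the partial sums (16 -> 32 -> 64)."""
    e16, e32, e64 = (expected_norm_sq(s, alpha, beta, k) for k in (16, 32, 64))
    return (e64 - e32) / (e32 - e16)
```

With `alpha = 2`, the ratio is close to one half for `s = 0.5` (the series converges), while at the critical value `s = 1` it stays close to one (logarithmic divergence), consistent with the condition $s < \alpha - 1$.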



